Beginner Questions.


 
# 1  
Old 06-02-2010

This is the Test_Data.snp file: MEGAUPLOAD - The leading online storage and file delivery service

1. The problem statement, all variables and given/known data:
Problem Set:

Before you get started working with these challenges, be aware that the first challenge is reformatting the test data file so that you get rid of the 'header' and get all of the columns delimited for working with in Unix. (I'll give you another clue in addition to getting rid of the header: learn 'grep', 'cat', 'cut', 'awk', 'sed'.)

Write a script to change the extension of your file from Test_Data.snp to Test_Data.txt.
Print all lines that have an 'A' base call in either the reference (column 2) or query (column 3) strain.
Print only the column titled [LEN R] to a new file called Reference_length.txt.
Sort the file by column 4 (titled [P2]).
Print only the lines that have a base call in both columns 2 and 3 (under the [SUB] heading), sort by [LEN R], and output to a new file called snp_report.txt.


2. Relevant commands, code, scripts, algorithms:

I'm not sure what this means?

3. The attempts at a solution (include all code and scripts):

The only thing I know how to do is actually show the data set in the terminal window


4. Complete Name of School (University), City (State), Country, Name of Professor, and Course Number (Link to Course):

This is part of a learning scholarship over the summer. I am working with Dr. Mia Champion of TGEN North in Flagstaff. She recommended that I come here for help.


Thanks for any help you can provide. I literally just started learning this a day ago, so please bear with me.
# 2  
Old 06-02-2010
You might want to post a sample of the file layout next time, rather than asking us to download the whole file. Otherwise, the following should answer most, if not all, of the tasks in order. Just be aware that there's always more than one way to do it.

It's now up to you to actually deconstruct them per your study guide(s) or texts. HTH.

Code:
mv Test_Data.snp Test_Data.txt
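Since the task asks for a script, here is one possible sketch that does the same rename with shell parameter expansion, so it would also cope with more than one .snp file. The scratch directory and the empty placeholder file are assumptions made only so the example can be run anywhere; they are not part of the original task.

```shell
#!/bin/sh
# Work in a scratch directory so the sketch is safe to run anywhere.
workdir=$(mktemp -d)
touch "$workdir/Test_Data.snp"    # empty placeholder standing in for the real file

# ${f%.snp} strips the trailing ".snp"; ".txt" is then appended.
for f in "$workdir"/*.snp; do
    mv -- "$f" "${f%.snp}.txt"
done

ls "$workdir"
```

For a single file the plain mv above is all you need; the loop only pays off when several files have to be renamed at once.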

Code:
awk ' $2 ~ /A/ || $3 ~ /A/ { print $0;} ' Test_Data.snp

Code:
awk '{print $9;}' Test_Data.snp >Reference_length.txt

Code:
sort -n -k4 <Test_Data.snp

Code:
awk ' $2 !~ /\./ && $3 !~ /\./ { print $0; }' Test_Data.snp | sort -n -k9 >snp_report.txt
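The one-liners above read the header lines along with the data. As a hedged sketch of the last task, the following drops the header first, keeps only rows whose columns 2 and 3 are both a base letter, and sorts numerically on field 9 ([LEN R]). The file name sample.snp and the row values are made up for illustration (the [LEN R] values in particular are invented so the sort has something to do), and the five-line header is an assumption based on the listing further down.

```shell
# Build a small made-up sample with the same layout as the real file:
# 5 header lines, then whitespace-separated data rows.
workdir=$(mktemp -d)
cat > "$workdir/sample.snp" <<'EOF'

NUCMER

    [P1]  [SUB]  [P2]      |   [BUFF]   [DIST]  |  [LEN R]  [LEN Q]  | [FRM]  [TAGS]
========================================================================================
       7   A .   1892597   |        7        7  |  300  1892819  |  1  1  refA   qryA
     140   C T   1892730   |        2       90  |  200  1892819  |  1  1  refB   qryB
     142   T A   1892732   |        2       88  |  100  1892819  |  1  1  refC   qryC
    2960   . G   2732      |       77     2732  |   50  1892819  |  1  1  refD   qryD
EOF

# Skip the header, keep rows with a base in both columns 2 and 3,
# then sort numerically on field 9 ([LEN R]).
sed '1,5d' "$workdir/sample.snp" |
    awk '$2 ~ /^[ACGT]$/ && $3 ~ /^[ACGT]$/' |
    sort -k9,9n > "$workdir/snp_report.txt"

cat "$workdir/snp_report.txt"
```

Matching on /^[ACGT]$/ instead of "not a dot" is a design choice: it also rejects header text such as [SUB], which a negative test would let through.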

In case anyone else might want to offer something:

Code:
$ head -20 Test_Data.snp #|tail -n +6 |awk ' $2 !~ /\./ && $3 !~ /\./ { print $0; }'

NUCMER

    [P1]  [SUB]  [P2]      |   [BUFF]   [DIST]  |  [LEN R]  [LEN Q]  | [FRM]  [TAGS]
========================================================================================
       7   A .   1892597   |        7        7  |  1895994  1892819  |  1  1  LVS_Francisella   SchuS4_Francisella
     140   C T   1892730   |        2       90  |  1895994  1892819  |  1  1  LVS_Francisella   SchuS4_Francisella
     142   T A   1892732   |        2       88  |  1895994  1892819  |  1  1  LVS_Francisella   SchuS4_Francisella
     153   A G   1892743   |       11       77  |  1895994  1892819  |  1  1  LVS_Francisella   SchuS4_Francisella
     213   A G   1892803   |       17       17  |  1895994  1892819  |  1  1  LVS_Francisella   SchuS4_Francisella
     630   T C   401       |      175      401  |  1895994  1892819  |  1  1  LVS_Francisella   SchuS4_Francisella
     805   G A   576       |      175      576  |  1895994  1892819  |  1  1  LVS_Francisella   SchuS4_Francisella
    1054   C T   825       |      249      825  |  1895994  1892819  |  1  1  LVS_Francisella   SchuS4_Francisella
    2960   . G   2732      |       77     2732  |  1895994  1892819  |  1  1  LVS_Francisella   SchuS4_Francisella
    3037   G A   2809      |       77     2809  |  1895994  1892819  |  1  1  LVS_Francisella   SchuS4_Francisella
    3329   A C   3101      |      104     3101  |  1895994  1892819  |  1  1  LVS_Francisella   SchuS4_Francisella
    4354   A G   1832816   |       67     4354  |  1895994  1892819  |  1  1  LVS_Francisella   SchuS4_Francisella
    4421   C A   1832883   |       27     4421  |  1895994  1892819  |  1  1  LVS_Francisella   SchuS4_Francisella
    4448   T C   1832910   |       27     4448  |  1895994  1892819  |  1  1  LVS_Francisella   SchuS4_Francisella
    4539   A G   1833001   |       17     4539  |  1895994  1892819  |  1  1  LVS_Francisella   SchuS4_Francisella


Last edited by curleb; 06-02-2010 at 11:15 PM..
# 3  
Old 06-03-2010
Thank you. I really appreciate it. I've been having a tough time not only figuring out the problem set, but asking for help on the forums. There's just a lot of jargon that I simply don't know. I appreciate your understanding and help.

---------- Post updated at 07:09 PM ---------- Previous update was at 01:38 AM ----------

I just got all the outputs I wanted except I'm still not sure how to "delimit" and remove the header so I can use the data in UNIX?

Can anyone help?

Thanks a lot!
# 4  
Old 06-04-2010
Do you want to remove the header line
Code:
NUCMER

from your file?

Is that what you want (that is, is NUCMER your header)?
# 5  
Old 06-07-2010
Honestly, I don't know. I didn't get a lot of info on the problem. Sorry :/

---------- Post updated 06-07-10 at 12:00 PM ---------- Previous update was 06-06-10 at 09:03 PM ----------

My mentor told me that to get rid of the header I have to use

sed '1,5d'

but I'm not sure how to implement that for the problem.
# 6  
Old 06-07-2010
Pretty much the same way the following does it with the tail command:
Code:
head -20 Test_Data.snp | tail -n +6

In your case, you'd pipe it through sed instead:
Code:
head -20 Test_Data.snp | sed '1,5d'

You could also run either one directly on the file (which is actually more efficient):
Code:
tail -n +6 Test_Data.snp

Code:
sed '1,5d' Test_Data.snp

Best to think of this effort as a sandbox and get dirty playing...not likely you're in a place to muck too much up.
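For the "delimit" half of the question, one possible approach (a sketch, certainly not the only way) is to strip the header, drop the "|" separators, and re-join the remaining fields with tabs. The two data rows are copied from the listing earlier in the thread; the file names sample.snp and Test_Data.tsv and the five-line header are assumptions for illustration.

```shell
workdir=$(mktemp -d)
cat > "$workdir/sample.snp" <<'EOF'

NUCMER

    [P1]  [SUB]  [P2]      |   [BUFF]   [DIST]  |  [LEN R]  [LEN Q]  | [FRM]  [TAGS]
========================================================================================
       7   A .   1892597   |        7        7  |  1895994  1892819  |  1  1  LVS_Francisella   SchuS4_Francisella
     140   C T   1892730   |        2       90  |  1895994  1892819  |  1  1  LVS_Francisella   SchuS4_Francisella
EOF

# 1) sed drops the 5 header lines.
# 2) awk deletes the "|" separators, then "$1 = $1" makes awk rebuild
#    the line using OFS, so every column ends up tab-separated.
sed '1,5d' "$workdir/sample.snp" |
    awk 'BEGIN { OFS = "\t" } { gsub(/[|]/, ""); $1 = $1; print }' \
    > "$workdir/Test_Data.tsv"

# With tabs in place, cut can pull a column by number.
# Once the "|" columns are gone, [LEN R] is column 7.
cut -f7 "$workdir/Test_Data.tsv"
```

Note that the column numbers shift once the separators are removed: [LEN R] is field 9 in the raw file but column 7 in the tab-delimited one.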
# 7  
Old 06-11-2010
Code:
sed '1,5d'

sed is an editor in its own right (a stream editor). This command deletes lines 1 through 5, counting from the top of the file.
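Keep in mind that sed writes the result to standard output and leaves the input file untouched, so the output has to be redirected if you want to keep it. A small sketch, with made-up file names and contents:

```shell
workdir=$(mktemp -d)
# Made-up input: 5 "header" lines followed by 2 data lines.
printf '%s\n' h1 h2 h3 h4 h5 row1 row2 > "$workdir/input.txt"

# sed prints the edited text; redirect it to keep the result.
sed '1,5d' "$workdir/input.txt" > "$workdir/no_header.txt"

wc -l < "$workdir/input.txt"      # the original still has all 7 lines
cat "$workdir/no_header.txt"      # only row1 and row2 remain
```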