Creating/amending Name Column in existing .txt file


 
# 1  
Old 07-09-2008

With the help of this forum, I have a script with the following output:

chr7 27104483 27105154
chr7 27106872 27110789
chr7 27111956 27112830
chr7 27114388 27125180
chr7 27126966 27131260
chr7 27135440 27137796

which was created by the following script:

awk '1 == NR || $NF >= 1000 {
if (c) print _, __
_ = $1 FS $2
c = 1
}
{ __ = $3}
END {
print _, __} ' tester.txt


Now I need to add a fourth column to the output: the first 6 characters of the fourth column of the input file, followed by the word "Cluster" and the number of the current output line. Right now I can only print the input's 4th column as-is; I'm not sure how to create the counter or how to trim the name down to its first 6 characters. Here is an example of what I have and what I need:

INPUT:

chr7 27104483 27104633 ENm010Block71 150 0
chr7 27104634 27104812 ENm010Block72 178 0
chr7 27104813 27105154 ENm010Block73 341 0
chr7 27106872 27106977 ENm010Block74 105 1717
chr7 27106978 27107481 ENm010Block75 503 0
chr7 27107482 27108156 ENm010Block76 674 0
chr7 27108157 27108194 ENm010Block77 37 0
chr7 27108422 27108700 ENm010Block78 278 227
chr7 27109258 27109365 ENm010Block79 107 557
chr7 27109366 27109431 ENm010Block80 65 0
chr7 27109432 27110017 ENm010Block81 585 0
chr7 27110018 27110056 ENm010Block82 38 0
chr7 27110057 27110309 ENm010Block83 252 0
chr7 27110310 27110435 ENm010Block84 125 0
chr7 27110436 27110489 ENm010Block85 53 0
chr7 27110490 27110550 ENm010Block86 60 0
chr7 27110551 27110789 ENm010Block87 238 0
chr7 27111956 27112348 ENm010Block88 392 1166
chr7 27112374 27112830 ENm010Block89 456 25
chr7 27114388 27114881 ENm010Block90 493 1557
chr7 27114882 27115338 ENm010Block91 456 0
chr7 27115339 27115870 ENm010Block92 531 0
chr7 27116098 27116173 ENm010Block93 75 227
chr7 27116174 27116705 ENm010Block94 531 0
chr7 27116706 27116755 ENm010Block95 49 0
chr7 27116756 27116781 ENm010Block96 25 0
chr7 27116782 27116945 ENm010Block97 163 0
chr7 27116946 27117276 ENm010Block98 330 0
chr7 27117277 27117960 ENm010Block99 683 0
chr7 27118910 27119137 ENm010Block100 227 949
chr7 27119138 27119213 ENm010Block101 75 0
chr7 27119214 27119365 ENm010Block102 151 0
chr7 27119366 27119783 ENm010Block103 417 0
chr7 27119784 27119822 ENm010Block104 38 0
chr7 27119823 27119948 ENm010Block105 125 0
chr7 27119949 27119985 ENm010Block106 36 0
chr7 27119986 27120353 ENm010Block107 367 0
chr7 27120354 27120430 ENm010Block108 76 0
chr7 27120431 27120734 ENm010Block109 303 0
chr7 27120735 27120784 ENm010Block110 49 0
chr7 27120785 27121113 ENm010Block111 328 0
chr7 27121114 27121886 ENm010Block112 772 0
chr7 27121887 27121912 ENm010Block113 25 0
chr7 27121950 27122139 ENm010Block114 189 37
chr7 27122140 27122368 ENm010Block115 228 0
chr7 27122369 27122596 ENm010Block116 227 0
chr7 27123470 27123811 ENm010Block117 341 873
chr7 27123812 27124306 ENm010Block118 494 0
chr7 27124307 27125180 ENm010Block119 873 0
chr7 27126966 27127320 ENm010Block120 354 1785
chr7 27127612 27127725 ENm010Block121 113 291
chr7 27127726 27128410 ENm010Block122 684 0
chr7 27128411 27129055 ENm010Block123 644 0
chr7 27129056 27129182 ENm010Block124 126 0
chr7 27129183 27129550 ENm010Block125 367 0
chr7 27130006 27130043 ENm010Block126 37 455
chr7 27130044 27130880 ENm010Block127 836 0
chr7 27130881 27131260 ENm010Block128 379 0
chr7 27135440 27135630 ENm010Block129 190 4179
chr7 27136554 27136807 ENm010Block130 253 923
chr7 27136808 27136820 ENm010Block131 12 0
chr7 27136821 27136845 ENm010Block132 24 0
chr7 27136846 27136895 ENm010Block133 49 0
chr7 27136896 27137035 ENm010Block134 139 0
chr7 27137036 27137071 ENm010Block135 35 0
chr7 27137072 27137237 ENm010Block136 165 0
chr7 27137238 27137580 ENm010Block137 342 0
chr7 27137581 27137618 ENm010Block138 37 0
chr7 27137619 27137796 ENm010Block139 177 0

Desired OUTPUT:

chr7 27104483 27105154 ENm010Cluster1
chr7 27106872 27110789 ENm010Cluster2
chr7 27111956 27112830 ENm010Cluster3
chr7 27114388 27125180 ENm010Cluster4
chr7 27126966 27131260 ENm010Cluster5
chr7 27135440 27137796 ENm010Cluster6

What I have:

chr7 27104483 27105154
chr7 27106872 27110789
chr7 27111956 27112830
chr7 27114388 27125180
chr7 27126966 27131260
chr7 27135440 27137796

which was created by the following script:

awk '1 == NR || $NF >= 1000 {
if (c) print _, __
_ = $1 FS $2
c = 1
}
{ __ = $3}
END {
print _, __} ' tester.txt


Also, I don't think this should affect anything, but the input will eventually be a .bed file, not a .txt file, and the output should also be a .bed file.
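One thing that may matter for .bed output: BED files are commonly tab-delimited, while awk joins print arguments with a single space by default. A minimal sketch, assuming the downstream tool wants tabs (that requirement is an assumption, not something stated in this thread), is to set OFS in a BEGIN block:

```shell
# Sample line taken from the input posted in this thread.
out=$(printf 'chr7 27104483 27104633 ENm010Block71 150 0\n' |
awk 'BEGIN { OFS = "\t" }     # OFS is the separator print puts between its arguments
     { print $1, $2, $3 }')
echo "$out"
```

Note that OFS only applies to comma-separated print arguments; strings built by hand with FS (as in the script above) keep whatever FS contains.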

Thanks for any help/suggestions!
# 2  
Old 07-10-2008
Use the substr() function to pull out the first 6 characters (see the awk man page for details).

print _, __, ++i should do the second part for you. _ is a strange variable name to use!
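As a minimal sketch of those two pieces in isolation (the two sample lines come from the input posted above; the counter name i is arbitrary, and here it counts every input line rather than every cluster):

```shell
out=$(printf '%s\n' \
  'chr7 27104483 27104633 ENm010Block71 150 0' \
  'chr7 27104634 27104812 ENm010Block72 178 0' |
awk '{
  # substr(s, start, length): characters 1 to 6 of the fourth field
  # (++i): pre-increment, so the first value printed is 1
  print substr($4, 1, 6) "Cluster" (++i)
}')
echo "$out"
```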
# 3  
Old 07-14-2008
Thanks! I messed around with the incrementing a bit more but it worked in the end.

Last edited by awknerd; 07-14-2008 at 05:28 PM..
# 4  
Old 07-14-2008
Code:
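awk '1 == NR || $NF >= 1000 {
  # a new cluster starts here: print the previous one first
  if (c) 
    print _, __ "Cluster" ++i
  _ = $1 FS $2                    # chromosome and start of the new cluster
  c = 1
  }
{ __ = $3 FS substr($4, 1, 6) }   # latest end coordinate and 6-character name prefix
END {
  # print the final cluster
  print _, __ "Cluster" ++i
  } ' input.bed > output.bed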

Quote:
Originally Posted by Annihilannic
[...] _ is a strange variable name to use!
Definitely :)
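For comparison, here is a sketch of the same logic with descriptive variable names. The names chrom, start, end, prefix and n are my own choices, not from the thread, and the five sample lines are the first five lines of the posted input:

```shell
out=$(printf '%s\n' \
  'chr7 27104483 27104633 ENm010Block71 150 0' \
  'chr7 27104634 27104812 ENm010Block72 178 0' \
  'chr7 27104813 27105154 ENm010Block73 341 0' \
  'chr7 27106872 27106977 ENm010Block74 105 1717' \
  'chr7 27106978 27107481 ENm010Block75 503 0' |
awk '
NR == 1 || $NF >= 1000 {
  # a last-field value of 1000 or more starts a new cluster;
  # print the previous cluster before starting the next one
  if (NR > 1) print chrom, start, end, prefix "Cluster" (++n)
  chrom = $1
  start = $2
}
{
  end = $3                     # keep the most recent end coordinate
  prefix = substr($4, 1, 6)    # first 6 characters of the name
}
END {
  # print the last cluster, unless the input was empty
  if (NR > 0) print chrom, start, end, prefix "Cluster" (++n)
}')
echo "$out"
```

The behaviour should match the version above; only the names change, plus a guard so that empty input prints nothing.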
# 5  
Old 07-14-2008
Whoops! Here is the code I used:

awk '1 == NR || $NF >= 1000 {
if (c) print _, __, ___
_ = $1 FS $2
c = 1
i+=1
}

{ __ = $3; ___ = substr($4, 1, 6)"Cluster"i }
END {
print _, __, ___} ' tester.txt



Are there any advantages to the way you did it? Or disadvantages to the way I modified your code?

Thanks!

Last edited by awknerd; 07-15-2008 at 12:27 AM..
# 6  
Old 07-15-2008
Your version doesn't work for me, however radoulov's does. Are you sure you didn't typo the $i when copying your code in somehow?
# 7  
Old 07-15-2008
Worked for me..

It works for me. I changed the code I originally posted, so maybe try again and see whether it works now, hopefully! I just ran it again and it gave me the right output:


chr7 27104483 27105154 ENm010Cluster1
chr7 27106872 27110789 ENm010Cluster2
chr7 27111956 27112830 ENm010Cluster3
chr7 27114388 27125180 ENm010Cluster4
chr7 27126966 27131260 ENm010Cluster5
chr7 27135440 27137796 ENm010Cluster6