Add a new column to txt file containing filename


 
# 1  
Old 03-21-2012

I would like help adding a new column to a large txt file (~10MB) that contains the filename. I have searched other posts but have not found an adequate solution.

I need this extra column so I can concatenate >100 files and perform awk searches on this large file.

My current txt file looks like this:

Code:
#chr_name       chr_start       chr_end ref_base        alt_base        hom_het snp_quality     tot_depth       alt_depth       region  gene    change  annotation      dbSNP135_full   dbSNP135_common 1000G_2010Nov_allele_freq       1000G_2011Oct_allele_freq       SCS     CLN     OMIM
chr01   14542   14542   A       G       hom     4.11    3       3       ncRNA_exonic    WASH7P  .       .       rs1045833       .       .       .       .       .       .
chr01   14574   14574   A       G       hom     20.2    6       6       ncRNA_exonic    WASH7P  .       .       rs28503599      .       .       .       .       .       .
chr01   14590   14590   G       A       hom     20.2    6       6       ncRNA_exonic    WASH7P  .       .       rs707679        .       .       .       .       .       .
chr01   14599   14599   T       A       hom     20.2    6       6       ncRNA_exonic    WASH7P  .       .       rs707680        .       .       .       .       .       .
chr01   14604   14604   A       G       hom     20.2    6       6       ncRNA_exonic    WASH7P  .       .       rs4021621       .       .       .       .       .       .
chr01   14610   14610   T       C       hom     20.2    6       6       ncRNA_exonic    WASH7P  .       .       rs79134172      .       .       .       .       .       .
chr01   14699   14699   C       G       het     3.54    11      3       ncRNA_exonic    WASH7P  .       .       rs11490464      .       0.02    .       .       .       .
chr01   14907   14907   A       G       hom     184     28      27      ncRNA_intronic  WASH7P  .       .       rs6682375       .       .       .       .       .       .
chr01   14930   14930   A       G       hom     207     45      42      ncRNA_intronic  WASH7P  .       .       rs6682385       rs75454623      0.71    .       .       .       .
chr01   14933   14933   G       A       het     113     46      21      ncRNA_intronic  WASH7P  .       .       .       .       0.057   .       .       .       .
chr01   14976   14976   G       A       het     140     47      20      ncRNA_exonic    WASH7P  .       .       rs71252251      .       .       .       .       .       .
chr01   15118   15118   A       G       het     17.4    8       5       ncRNA_intronic  WASH7P  .       .       rs11580262      .       0.041   .       .       .       .
chr01   17701   17701   C       T       het     30      23      9       ncRNA_exonic    WASH7P  .       .       .       .       .       .       .       .       .
chr01   662857  662857  G       A       hom     87      36      32      ncRNA_exonic    LOC100133331    .       .       rs6689091       .       .       .       .       .       .
chr01   663097  663097  G       C       hom     120     91      85      ncRNA_exonic    LOC100133331    .       .       rs61769340      .       0.67    .       .       .       .
chr01   761732  761732  C       T       hom     222     45      45      ncRNA_exonic    LINC00115       .       .       rs2286139       rs2286139       0.537   0.61    .       .       .
chr01   761752  761752  C       T       hom     222     26      23      ncRNA_exonic    LINC00115       .       .       rs1057213       rs1057213       0.544   0.75    .       .       .
chr01   761800  761800  A       T       hom     22      5       3       ncRNA_exonic    LINC00115       .       .       rs1064272       .       0.114   .       .       .       .
chr01   761811  761811  G       A       hom     33.5    4       4       ncRNA_exonic    LINC00115       .       .       rs1057212       .       0.59    .       .       .       .
chr01   762273  762273  G       A       hom     190     52      48      ncRNA_exonic    LINC00115       .       .       rs3115849       rs3115849       0.555   0.72    .       .       .
chr01   762589  762589  G       C       hom     207     58      53      ncRNA_exonic    LINC00115       .       .       rs3115848       rs71507461      0.73    0.72    .       .       .
chr01   762592  762592  C       G       hom     180     60      53      ncRNA_exonic    LINC00115       .       .       rs3131950       rs71507462      0.72    0.72    .       .       .
chr01   762601  762601  T       C       hom     222     60      59      ncRNA_exonic    LINC00115       .       .       rs3131949       rs71507463      0.72    0.72    .       .       .
chr01   762632  762632  T       A       hom     222     89      87      ncRNA_exonic    LINC00115       .       .       rs3131948       rs61768173      0.71    0.72    .       .       .
chr01   787262  787262  C       G       hom     87      10      10      ncRNA_intronic  LOC643837       .       .       rs2905056       rs56108613      0.668   0.78    .       .       .

I would like to add a new column (at either the start or end, but start is preferred) that contains the filename in each line.

This is what I would like the output to look like for filename=samplexxx.txt

Code:
samplexxx #chr_name       chr_start       chr_end ref_base        alt_base        hom_het snp_quality     tot_depth       alt_depth       region  gene    change  annotation      dbSNP135_full   dbSNP135_common 1000G_2010Nov_allele_freq       1000G_2011Oct_allele_freq       SCS     CLN     OMIM
samplexxx chr01   14542   14542   A       G       hom     4.11    3       3       ncRNA_exonic    WASH7P  .       .       rs1045833       .       .       .       .       .       .
samplexxx chr01   14574   14574   A       G       hom     20.2    6       6       ncRNA_exonic    WASH7P  .       .       rs28503599      .       .       .       .       .       .
samplexxx chr01   14590   14590   G       A       hom     20.2    6       6       ncRNA_exonic    WASH7P  .       .       rs707679        .       .       .       .       .       .
samplexxx chr01   14599   14599   T       A       hom     20.2    6       6       ncRNA_exonic    WASH7P  .       .       rs707680        .       .       .       .       .       .
samplexxx chr01   14604   14604   A       G       hom     20.2    6       6       ncRNA_exonic    WASH7P  .       .       rs4021621       .       .       .       .       .       .
samplexxx chr01   14610   14610   T       C       hom     20.2    6       6       ncRNA_exonic    WASH7P  .       .       rs79134172      .       .       .       .       .       .

I have >100 files that I need to do this for, so any help will be greatly appreciated. I have no desire to open each file manually in excel to do this!

Many thanks
# 2  
Old 03-22-2012
Try it on a couple of files first, check the output, and then apply it to all the files.

Code:
 
for i in *; do nawk '{print FILENAME"\t"$0}' $i > $i.bk; mv $i.k $i; done

This User Gave Thanks to itkamaraj For This Post:
# 3  
Old 03-22-2012
Hi there.
Maybe you should use sed.
Try doing something like:

Code:
sed -i.bkp 's/^/filenameU/' your_input_file.txt

Are your columns delimited by tabs or spaces?
If they are tab-delimited and you're using bash, replace the U character in the command above with a literal tab: press Ctrl-V, then the TAB key.
That's it. Hope it helps.
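
For anyone who would rather not type the literal tab, here is a minimal sketch of the same sed idea with the tab generated by `printf`. The function name `prefix_with_name` is made up for illustration, and stripping the `.txt` extension is an assumption based on the `samplexxx` output requested in the first post.

```shell
# Hypothetical helper: prepend the file's base name (minus .txt) and a TAB
# to every line of one file, keeping a .bkp copy as in the sed command above.
prefix_with_name() {
    file=$1
    name=$(basename "$file" .txt)   # samplexxx.txt -> samplexxx
    tab=$(printf '\t')              # a real TAB character, no Ctrl-V needed
    # Assumes the name contains no '&' or backslash, which sed treats specially
    # in the replacement text.
    sed -i.bkp "s/^/${name}${tab}/" "$file"
}
```

Calling `prefix_with_name samplexxx.txt` should rewrite the file in place and leave the original in `samplexxx.txt.bkp`.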
This User Gave Thanks to chapeupreto For This Post:
# 4  
Old 03-25-2012
Quote:
Originally Posted by itkamaraj
Try with some two files, and check the output and then apply it for all the files.

Code:
 
for i in *; do nawk '{print FILENAME"\t"$0}' $i > $i.bk; mv $i.k $i; done

Worked perfectly; I just needed to change
Code:
mv $i.k $i

to
Code:
mv $i.bk $i

and it was great. Thank you
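
With the `mv` fix applied, the loop can also be sketched as a small function. Note that `FILENAME` includes the `.txt` extension, so this version passes the base name in with `-v` instead; that, the restriction to `*.txt` files, the quoting, and the use of plain `awk` in place of `nawk` are all assumptions rather than part of the original answer. The name `add_name_column` is hypothetical.

```shell
# Hypothetical wrapper: add the base name as a first column to every
# .txt file in a directory (defaults to the current directory).
add_name_column() {
    dir=${1:-.}
    for f in "$dir"/*.txt; do
        [ -f "$f" ] || continue        # skip if the glob matched nothing
        name=$(basename "$f" .txt)     # samplexxx.txt -> samplexxx
        # Write to a temporary .bk file and replace the original only if awk succeeded.
        awk -v name="$name" '{ print name "\t" $0 }' "$f" > "$f.bk" && mv "$f.bk" "$f"
    done
}
```

Matching only `*.txt` also keeps leftover `.bk` files out of the loop if it is run a second time. As suggested earlier in the thread, it is worth trying this on a copy of a couple of files first.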
# 5  
Old 03-26-2012

Hi Kelly,

Try this too,

Code:
nawk '{a=FILENAME;}{print a"\t"$0}' yourFilename > yourFilename.bk && mv yourFilename.bk yourFilename

Since you are working on a single file, you can get your expected output with a single command.

Regards,
SHANMU
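
Since the stated goal is to concatenate more than 100 files, the single-command idea might be stretched to do the labelling and the concatenation in one pass. This is only a sketch: the function name `combine_with_names` is made up, plain `awk` is assumed to behave like `nawk`, and dropping the `.txt` extension is an assumption based on the requested output.

```shell
# Hypothetical helper: print every line of the given files, prefixed with
# that file's base name (without .txt) and a TAB, as one combined stream.
combine_with_names() {
    awk '
        FNR == 1 {                    # first line of each input file
            name = FILENAME
            sub(/.*\//, "", name)     # drop any leading directory
            sub(/\.txt$/, "", name)   # drop the .txt extension
        }
        { print name "\t" $0 }
    ' "$@"
}
```

Something like `combine_with_names sample*.txt > combined.tsv` would then leave the original files untouched; the output name should not match the input glob, or it could be read as an input on a later run.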
 