Sorting and merging files.


 
# 1  
Old 09-21-2011

Hi,
I’m new to scripting and have only about two days of experience with it. I have a question about a bash/gawk script.

Problem:
I have 27 files that need to be merged into one. Each file is divided into 8 subdivisions, and each subdivision starts with a 3-line data description. Example of the data:

File 1
Code:
3-line header – corresponding to location and case/control number
  149999 rows of (numeric) data
  3-line header
  149999 rows
  etc.

Files 2-26
Code:
3-line header
  144999 rows of data
  etc.

File 27
Code:
3-line header
  130140 rows
  etc.

My start:
Code:
(S1=1
C1=150003
S2=4
C2=145003
C3=130140)
gawk 'NR>=1 && NR<=150003  {print}' coverage1.wig >> Merge_cov
echo "process file number: 1"
for i in {2..26}
do
gawk 'NR>=4 && NR<=145003  {print}' "coverage$i.wig" >> Merge_cov
echo "process file number: $i"
done
gawk 'NR>=4 && NR<=130144  {print}' coverage27.wig >> Merge_cov
echo "process file number: 27"
echo "Het.1 sorted"

I want to write a bash or gawk loop that processes the other 7 subsets of data, but I have had trouble assigning values in the bash script: gawk could not read them.

Another condition that may be useful: the data lines are purely numeric and the headers are text.



Last edited by radoulov; 09-22-2011 at 11:58 AM.. Reason: please use code tags for your code and data! Thanks
# 2  
Old 09-21-2011
You can use sed:
Code:
mFrom=10
mTo=15
sed -n "${mFrom},${mTo}p" File
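
To connect this with the line ranges in the first post, here is a minimal sketch that wraps the same sed call in a function. The function name print_range is made up for illustration; the key point is that the sed script is in double quotes, so the shell expands the variables before sed runs.

```shell
# print_range FROM TO FILE: print lines FROM..TO (inclusive) of FILE.
# (print_range is a hypothetical helper name, not part of the thread.)
print_range() {
    # Double quotes: the shell substitutes $1 and $2 into the sed script.
    sed -n "${1},${2}p" "$3"
}
```

With the numbers from the first post, a call could look like print_range 4 145003 coverage2.wig >> Merge_cov.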

# 3  
Old 09-21-2011
You can feed variables into awk with -v VARNAME="${VAR}" instead of trying to embed them in the awk script itself.
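
A minimal sketch of that idea, assuming the row ranges from the first post. The function name merge_range and the awk variable names first and last are made up for illustration.

```shell
# merge_range START END FILE: print lines START..END (inclusive) of FILE.
# (merge_range, first and last are hypothetical names.)
merge_range() {
    # -v copies the shell values into awk variables, so the awk program
    # can stay in single quotes and needs no shell expansion inside it.
    awk -v first="$1" -v last="$2" 'NR >= first && NR <= last' "$3"
}
```

For example, merge_range 1 150003 coverage1.wig > Merge_cov for the first file, then merge_range 4 145003 "coverage$i.wig" >> Merge_cov inside the loop over files 2 to 26.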
# 4  
Old 09-22-2011
Say I have this script. It does the job, but it's really ugly. I need an array of the search elements ([hH]et1, [hH]et2, [hH]et3, [hH]et4, etc.) that I can feed into a loop.

Is there a smarter way than reading each of the 27 files 8 times, so that each file is read only once and the output still comes out in the right order? Currently I'm just feeding the files in the order I want.

Thanks in advance for helping a beginner :)


Code:
#!/bin/bash
# bash rather than sh: the {2..27} brace expansion is not available in plain sh.

# For each sample: take the block from the first file (keeping its header and
# inserting the full-range browser line), then append only the data rows of files 2-27.
awk '/[Hh]et1/,/browser/' coverage1.txt | sed -e '/track/i\browser position chr13:49199927-53100067' -e '$d' >> out
for i in {2..27}
do
awk '/[hH]et1/,/browser/' "coverage$i.txt" | sed '1,2d;$d' >> out
done

awk '/[Hh]et2/,/browser/' coverage1.txt | sed -e '/track/i\browser position chr13:49199927-53100067' -e '$d' >> out
for i in {2..27}
do
awk '/[hH]et2/,/browser/' "coverage$i.txt" | sed '1,2d;$d' >> out
done

awk '/[Hh]et3/,/browser/' coverage1.txt | sed -e '/track/i\browser position chr13:49199931-53100067' -e '$d' >> out
for i in {2..27}
do
awk '/[Hh]et3/,/browser/' "coverage$i.txt" | sed '1,2d;$d' >> out
done

awk '/[hH]et4/,/browser/' coverage1.txt | sed -e '/track/i\browser position chr13:49199927-53100067' -e '$d' >> out
for i in {2..27}
do
awk '/[Hh]et4/,/browser/' "coverage$i.txt" | sed '1,2d;$d' >> out
done

awk '/[Hh]omo1/,/browser/' coverage1.txt | sed -e '/track/i\browser position chr13:49199927-53100067' -e '$d' >> out
for i in {2..27}
do
awk '/[Hh]omo1/,/browser/' "coverage$i.txt" | sed '1,2d;$d' >> out
done

awk '/[Hh]omo2/,/browser/' coverage1.txt | sed -e '/track/i\browser position chr13:49199927-53100067' -e '$d' >> out
for i in {2..27}
do
awk '/[Hh]omo2/,/browser/' "coverage$i.txt" | sed '1,2d;$d' >> out
done

awk '/[Hh]omo3/,/browser/' coverage1.txt | sed -e '/track/i\browser position chr13:49199931-53100049' -e '$d' >> out
for i in {2..27}
do
awk '/[Hh]omo3/,/browser/' "coverage$i.txt" | sed '1,2d;$d' >> out
done

awk '/[Hh]omo4/,/browser/' coverage1.txt | sed -e '/track/i\browser position chr13:49199927-53100067' -e '$d' >> out
for i in {2..27}
do
awk '/[Hh]omo4/,/browser/' "coverage$i.txt" | sed '1,2d;$d' >> out
done

# The last sample has no trailing "browser" line, so it is selected by row count instead.
grep Relative -A 150000 coverage1.txt | sed -e '/track/i\browser position chr13:49199927-53100067' >> out
for i in {2..26}
do
grep Relative -A 145000 "coverage$i.txt" | sed '1,2d' >> out
done
grep Relative -A 130140 coverage27.txt | sed '1,2d' >> out
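
The repeated stanzas above could probably be folded into a loop over the sample names. The following is only a sketch under some assumptions: the function names extract_block and merge_sample are made up, the sample name is passed in lowercase and matched case-insensitively via tolower(), and one browser range is used per call (the script above uses a slightly different range for het3 and homo3, so that value would have to be passed per sample). It prints the browser line directly instead of inserting it with sed. Like the original, it only works for samples that are followed by another browser line, so the last "Relative" block still needs separate handling.

```shell
# extract_block NAME FILE: print from the line containing NAME (any case)
# up to and including the next line containing "browser".
# (extract_block and merge_sample are hypothetical helper names.)
extract_block() {
    awk -v pat="$1" 'tolower($0) ~ pat, /browser/' "$2"
}

# merge_sample RANGE NAME FILE...: write one merged block for sample NAME.
merge_sample() {
    range=$1
    name=$2
    shift 2
    printf 'browser position %s\n' "$range"
    first=1
    for f in "$@"; do
        if [ "$first" -eq 1 ]; then
            # First file: keep the track and fixedStep lines, drop the trailing browser line.
            extract_block "$name" "$f" | sed '$d'
            first=0
        else
            # Later files: drop the two header lines and the trailing browser line.
            extract_block "$name" "$f" | sed '1,2d;$d'
        fi
    done
}
```

A loop such as for name in het1 het2 het4 homo1 homo2 homo4; do merge_sample chr13:49199927-53100067 "$name" coverage1.txt coverage2.txt >> out; done would then replace the copied stanzas, with the full list of 27 files in the desired order.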


Last edited by radoulov; 09-22-2011 at 11:58 AM..
# 5  
Old 09-22-2011
If you display a sample of the input and expected output, you would have a lot more responses.
# 6  
Old 09-22-2011
OK, no problem. Here is a sample of the input:

browser position chr13:49199927-49349926
track type=wiggle_0 name="het1 " description="Coverage" maxHeightPixels=100:50:20 visibility=full autoScale=off viewLimits=0.0:377 color=0,0,0 yLineOnOff=on priority=10
fixedStep chrom=chr13 start=49199927 step=1 span=1
1
1
1
2
2
3
3
3
4
etc - 150000 rows
browser position chr13:49199927-49349926
track type=wiggle_0 name="het2 " description="Coverage" maxHeightPixels=100:50:20 visibility=full autoScale=off viewLimits=0.0:377 color=0,0,0 yLineOnOff=on priority=10
fixedStep chrom=chr13 start=49199927 step=1 span=1
1
20
20
21
21
22
24
25
27
29
etc. The pattern is similar for the 8 samples. The last one says Relative coverage and just ends with the last numbers; that's why I selected it by row number and not with awk /pattern/,/pattern/.

I just want to merge all files into one and change the header so it corresponds to the correct range.
# 7  
Old 09-22-2011
Will this solve your problem?
Code:
 sed -n '/^[0-9]/p' All_27_files
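
The sed command above keeps only the numeric rows, so the headers are lost. If the headers should be kept once per sample while each file is read only once, a sketch like the following might work. It is not from the thread: the function name split_by_track, the .part file names and the single RANGE argument are assumptions, and it relies on the sample name appearing in name="..." on each track line as in the sample shown earlier. OUTDIR must already exist.

```shell
# split_by_track RANGE OUTDIR FILE...
# Reads every FILE once and appends each sample's numeric rows to
# OUTDIR/<sample>.part; the header is written only the first time a sample is seen.
# (split_by_track and the .part naming are hypothetical.)
split_by_track() {
    range=$1
    outdir=$2
    shift 2
    awk -v range="$range" -v outdir="$outdir" '
        /^browser/ { next }              # dropped; rebuilt from "range" below
        /^track/ {
            key = $0
            sub(/^.*name="/, "", key)    # strip everything up to the sample name
            sub(/[ \t]*".*$/, "", key)   # strip the closing quote and the rest
            key = tolower(key)           # het1 and Het1 count as the same sample
            gsub(/[^a-z0-9]+/, "_", key) # make the name safe as a file name
            out = outdir "/" key ".part"
            first = !(key in seen)
            seen[key] = 1
            if (first) {
                hdr = "browser position " range
                print hdr > out
                print > out
            }
            next
        }
        /^fixedStep/ { if (first) print > out; next }
        /^[0-9]/ && out != "" { print > out }
    ' "$@"
}
```

The files have to be passed in numeric order (coverage1.txt, coverage2.txt, and so on, not a plain glob, which would sort coverage10.txt before coverage2.txt). Afterwards the parts can be concatenated in whatever sample order is wanted, for example cat parts/het1.part parts/het2.part > out.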

 