help: Awk to control number of characters per line


 
# 1  
Old 06-10-2010

Hello all,

I have the following problem:

My input is two sorted files:

file1
Code:
>1_19_130_F3
T01220131330230213311013000000110000
>1_23_69_F3
T01200211300200200010000001000000
>1_24_124_F3
T010203113002002111111200002010

file2
Code:
>1_19_130_F3
24 18 9 18 23 4 11 4 5 9 5 8 15 20 4 4 7 4 7 4 4 4 4 4 4 4 4 7 4 4 4 4 8 4 9 
>1_23_69_F3
26 4 7 4 4 17 5 23 4 4 5 6 5 5 4 4 4 5 4 4 4 4 4 4 4 4 4 8 4 4 4 4 7 4 4 
>1_24_124_F3
32 27 24 18 29 22 23 17 18 19 24 19 15 29 12 9 16 6 26 4 4 4 4 4 4 4 4 7 4 4 4 12 10 4 5

Now I want to create a file like file2, but with each data line cut down to as many fields as the corresponding line in file1 has digits:
Output:
Code:
>1_19_130_F3
24 18 9 18 23 4 11 4 5 9 5 8 15 20 4 4 7 4 7 4 4 4 4 4 4 4 4 7 4 4 4 4 8 4 9 
>1_23_69_F3
26 4 7 4 4 17 5 23 4 4 5 6 5 5 4 4 4 5 4 4 4 4 4 4 4 4 4 8 4 4 4 4 
>1_24_124_F3
32 27 24 18 29 22 23 17 18 19 24 19 15 29 12 9 16 6 26 4 4 4 4 4 4 4 4 7 4 4

I'm pretty sure there must be an easy solution to this, but I can't figure it out yet. Do you have any idea how to do this with awk?

Thanks for your help,
Seb

edit: typo
# 2  
Old 06-10-2010
Can you explain your request more clearly?

Code:
Now I want to create a file similar to file 2, but with the same amount of fields than the respective line has numbers in file 1:

# 3  
Old 06-10-2010
Sure. My input is file1 and file2 above.

They are both sorted and have the same number of lines.

Now I want to create an output file that is the same as file2, except that each line has only as many fields as the same line in file1 has integer values.

Basically, if line "n" has 5 integer values in file1:
Code:
T01210

I want to change the line "n" in file2 from:

Code:
26 23 54 4 22 4 6 3 6 8 66

to

Code:
26 23 54 4 22

By the way, in file1 line "n" always has a T as its first character, followed by the integer values.

I hope this helps!

Last edited by DerSeb; 06-10-2010 at 10:00 AM..
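
The rule described in this post can be tried on a single pair of lines. This is only a minimal sketch: the variable names `seq_line` and `values_line` are hypothetical, and the digit count is taken as the length of the file1 line minus the leading T.

```shell
seq_line="T01210"                          # sample line "n" from file1
values_line="26 23 54 4 22 4 6 3 6 8 66"   # matching line "n" from file2

trimmed=$(echo "$values_line" | awk -v s="$seq_line" '{
    n = length(s) - 1                      # digits after the leading T
    out = $1
    for (i = 2; i <= n && i <= NF; i++)    # append fields 2..n
        out = out " " $i
    print out
}')
echo "$trimmed"
```

For the sample values this prints `26 23 54 4 22`, matching the result asked for above.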
# 4  
Old 06-10-2010
Code:
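awk 'NR==FNR { T[NR] = length($1); next }   # file1: store line length (T + digits)
     FNR % 2 { print; next }                 # file2 header lines: print unchanged
     { for (i = 1; i < T[FNR]; i++)          # data lines: keep one field per digit
         printf "%s%s", $i, (i < T[FNR] - 1 ? " " : "")
       printf "\n"
     }' file1 file2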


Last edited by rdcwayx; 06-10-2010 at 10:35 PM..
# 5  
Old 06-11-2010
Wow, thanks for the quick response and your program.

However, I still have some problems. First, it seems to take a lot of memory (the input files are about 4 GB each), but I got it running by assigning enough memory to the job.
However, it then crashed with the message "Wrong placed ()."

Could that be caused by my submitting this job to an SGE cluster?

Thanks again!
Sebastian
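
The thread does not confirm what caused this error. One possible cause (an assumption) is that the job script was run by a csh-type shell, which does not accept a quoted awk program that spans several lines. A minimal sketch that sidesteps shell quoting altogether keeps the awk program in its own file and runs it with `awk -f`. The file name `trim.awk`, the scratch directory, and the tiny sample inputs are all made up for illustration; the logic follows the program in post #4.

```shell
tmp=$(mktemp -d)    # scratch directory, so no real file1/file2 is touched

# Made-up miniature inputs in the same layout as file1 and file2.
printf '%s\n' '>a' 'T0122' '>b' 'T012' > "$tmp/file1"
printf '%s\n' '>a' '24 18 9 18 23 4' '>b' '26 4 7 4 4 17' > "$tmp/file2"

# Store the awk program in a file; the shell then never parses its text.
cat > "$tmp/trim.awk" <<'EOF'
NR == FNR { T[NR] = length($1); next }   # file1: remember each line length
FNR % 2   { print; next }                # file2 header lines: print unchanged
{
    out = $1                             # keep T[FNR]-1 fields (digits after the T)
    for (i = 2; i < T[FNR]; i++) out = out " " $i
    print out
}
EOF

# This single line is all the job script itself would need to contain.
awk -f "$tmp/trim.awk" "$tmp/file1" "$tmp/file2" > "$tmp/trimmed.txt"
cat "$tmp/trimmed.txt"
```

The program file would be created once, outside the job; only the final `awk -f` line has to be understood by whatever shell the cluster uses.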
# 6  
Old 06-12-2010
You can split the files into smaller pieces first.
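
A rough sketch of that splitting approach, assuming every record is exactly two lines (header plus data), so that an even chunk size keeps pairs together and the chunks of both files line up. The chunk size of 2 lines, the `part1.`/`part2.` prefixes, and the miniature sample inputs are illustrative only; real inputs would use a much larger even number.

```shell
tmp=$(mktemp -d)    # scratch directory for the illustration

# Made-up miniature inputs in the same layout as file1 and file2.
printf '%s\n' '>a' 'T0122' '>b' 'T012' > "$tmp/file1"
printf '%s\n' '>a' '24 18 9 18 23 4' '>b' '26 4 7 4 4 17' > "$tmp/file2"

# Split both files at the same even line count, so that chunk k of
# file1 corresponds to chunk k of file2.
split -l 2 "$tmp/file1" "$tmp/part1."
split -l 2 "$tmp/file2" "$tmp/part2."

# Process matching chunks one pair at a time and append the results.
: > "$tmp/trimmed.txt"
for f1 in "$tmp"/part1.*; do
    f2="$tmp/part2.${f1##*.}"            # same suffix: aa, ab, ...
    awk 'NR == FNR { T[NR] = length($1); next }
         FNR % 2   { print; next }
         { out = $1
           for (i = 2; i < T[FNR]; i++) out = out " " $i
           print out }' "$f1" "$f2" >> "$tmp/trimmed.txt"
done
cat "$tmp/trimmed.txt"
```

Because each awk run only holds the line lengths of one chunk, peak memory depends on the chunk size rather than on the size of the whole file.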
# 7  
Old 06-12-2010
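If memory usage is a problem, you can try this:
Code:
awk '/^>/ { getline line < "file1"; print; next }   # header: consume matching file1 line, print as-is
{
  getline line < "file1"       # read the matching sequence line from file1
  gsub("[A-Za-z]","",line)     # drop letters (the leading T), leaving the digits
  n=length(line)               # number of fields to keep
  $(n+1)="_"                   # plant a marker in the first field to drop
  sub(" _.*","")               # cut the marker and everything after it
}1' file2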
