help: Awk to control number of characters per line


 
# 8  
Old 06-12-2010
Hi.

I think Franklin52's code is more compact than this. This post shows the results and compares with your desired results:
Code:
#!/usr/bin/env bash

# @(#) s1	Demonstrate adjusting the field count according to the digit count.

# Uncomment to run script as external user.
# export PATH="/usr/local/bin:/usr/bin:/bin"
# Infrastructure details, environment, commands for forum posts. 
set +o nounset
pe() { for i;do printf "%s" "$i";done; printf "\n"; }
pl() { pe;pe "-----" ;pe "$*"; }
LC_ALL=C ; LANG=C ; export LC_ALL LANG
pe ; pe "Environment: LC_ALL = $LC_ALL, LANG = $LANG"
pe "(Versions displayed with local utility \"version\")"
c=$( ps | grep $$ | awk '{print $NF}' )
version >/dev/null 2>&1 && s=$(_eat $0 $1) || s=""
[ "$c" = "$s" ] && p="$s" || p="$c"
version >/dev/null 2>&1 && version "=o" $p specimen awk cmp diff
set -o nounset

FILE1=${1-data1}
shift
FILE2=${1-data2}

# Sample data files with head / tail if specimen fails.
pe
specimen $FILE1 $FILE2 \
|| { pe "(head/tail)"; head -n 5 $FILE1 $FILE2; pe " ||" ;\
     tail -n 5 $FILE1 $FILE2; }

pl " Results:"
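awk -v f2="$FILE2" '
# Odd-numbered lines (headers): print, and skip the matching line of f2.
NR % 2 != 0 { print ; getline unused < f2 ; next }
            { # Even-numbered lines: split on "" gives one element per
              # character (a gawk feature, not guaranteed by POSIX awk);
              # subtract 1 for the leading letter.
              digits = split($0,junk,"") - 1
              # print " Found", digits, " fields in line", NR
              # Replace $0 with the matching line of f2, then print
              # only its first "digits" fields.
              getline < f2
              for (i = 1 ; i <= digits-1 ; i++) {
                printf("%s%s",$i,FS)
              }
              printf("%s",$digits)
              printf("\n")
            }
' "$FILE1" |
tee t1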

# Check results.

pl " Comparison with desired results:"
if cmp expected-output.txt t1
then
  pe " Passed -- files have same content."
else
  pe " Failed -- files differ -- details:"
  diff expected-output.txt t1
fi

exit 0

producing:
Code:
% ./s1

Environment: LC_ALL = C, LANG = C
(Versions displayed with local utility "version")
OS, ker|rel, machine: Linux, 2.6.26-2-amd64, x86_64
Distribution        : Debian GNU/Linux 5.0 
GNU bash 3.2.39
specimen (local) 1.17
GNU Awk 3.1.5
cmp (GNU diffutils) 2.8.1
diff (GNU diffutils) 2.8.1

Whole: 5:0:5 of 6 lines in file "data1"
>1_19_130_F3
T01220131330230213311013000000110000
>1_23_69_F3
T01200211300200200010000001000000
>1_24_124_F3
T010203113002002111111200002010

Whole: 5:0:5 of 6 lines in file "data2"
>1_19_130_F3
24 18 9 18 23 4 11 4 5 9 5 8 15 20 4 4 7 4 7 4 4 4 4 4 4 4 4 7 4 4 4 4 8 4 9 
>1_23_69_F3
26 4 7 4 4 17 5 23 4 4 5 6 5 5 4 4 4 5 4 4 4 4 4 4 4 4 4 8 4 4 4 4 7 4 4 
>1_24_124_F3
32 27 24 18 29 22 23 17 18 19 24 19 15 29 12 9 16 6 26 4 4 4 4 4 4 4 4 7 4 4 4 12 10 4 5

-----
 Results:
>1_19_130_F3
24 18 9 18 23 4 11 4 5 9 5 8 15 20 4 4 7 4 7 4 4 4 4 4 4 4 4 7 4 4 4 4 8 4 9
>1_23_69_F3
26 4 7 4 4 17 5 23 4 4 5 6 5 5 4 4 4 5 4 4 4 4 4 4 4 4 4 8 4 4 4 4
>1_24_124_F3
32 27 24 18 29 22 23 17 18 19 24 19 15 29 12 9 16 6 26 4 4 4 4 4 4 4 4 7 4 4

-----
 Comparison with desired results:
 Passed -- files have same content.

The core awk script processes pairs of lines sequentially, one pair at a time. It does not keep any extra data in memory. The line from the "control" file is broken into single-character strings, but only the count is important. The main data file is read and that count of fields is written.
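The same pair-at-a-time idea can be sketched more compactly. This is only a sketch, assuming every control line starts with exactly one non-digit character (the "T" in the samples), so the count is simply the line length minus one. The function name trim_fields and the temporary file names are made up for the illustration.

```shell
#!/bin/sh
# Hypothetical helper: trim_fields CONTROL DATA
# Header (odd) lines of DATA pass through; each even line keeps as many
# fields as there are characters after the first one on the matching
# line of CONTROL.
trim_fields() {
  awk -v ctl="$1" '
    { getline c < ctl }            # matching line of the control file
    FNR % 2 { print; next }        # header line: print unchanged
    {
      n = length(c) - 1            # assumes one leading letter, e.g. "T"
      if (n > NF) n = NF           # never ask for more fields than exist
      out = ""
      for (i = 1; i <= n; i++) out = out (i > 1 ? " " : "") $i
      print out
    }
  ' "$2"
}

# Small demonstration with throwaway files.
d=$(mktemp -d)
printf '%s\n' '>id1' 'T0123'       > "$d/ctl"
printf '%s\n' '>id1' '9 8 7 6 5 4' > "$d/dat"
trim_fields "$d/ctl" "$d/dat"      # prints ">id1" then "9 8 7 6"
rm -r "$d"
```

Using length() avoids relying on split() with an empty separator, and building the line in a variable avoids a trailing blank at the end of each output line.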

Best wishes ... cheers, drl
# 9  
Old 06-12-2010
Thanks for your replies. Using
Code:
  awk 'NR==FNR{T[NR]=length($1);next} 
      {if (FNR%2) {print $0} 
      else {{for (i=1;i<T[FNR];i++) printf "%s ",$i}; printf "\n"}
      }' file1 file2

works with small files, but the output is identical to file2.

Using

Code:
awk '{
  getline line < "file1"
  gsub("[A-Za-z]","",line)
  n=length(line)
  $(n+1)="_"
  sub(" _.*","")
}1' file2

yields a syntax error at
Code:
n=length(line)

and
Code:
$(n+1)="_"

at the "=" signs.
# 10  
Old 06-14-2010
Quote:
Originally Posted by DerSeb
Using

Code:
awk '{
  getline line < "file1"
  gsub("[A-Za-z]","",line)
  n=length(line)
  $(n+1)="_"
  sub(" _.*","")
}1' file2

yields a syntax error at
Code:
n=length(line)

and
Code:
$(n+1)="_"

at the "=" signs.
Try nawk or /usr/xpg4/bin/awk on Solaris.
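
Some background on that advice: as far as I know, /usr/bin/awk on Solaris is the original awk, which predates later additions such as gsub(), sub() and getline, so scripts that use them are rejected with syntax errors. A minimal sketch of choosing an implementation at run time follows; the variable name AWK and the order of the candidates are assumptions for the illustration.

```shell
#!/bin/sh
# Pick the first new-style awk that is available; fall back to plain awk.
AWK=awk
for cand in /usr/xpg4/bin/awk nawk gawk; do
  if command -v "$cand" >/dev/null 2>&1; then
    AWK=$cand
    break
  fi
done

# Quick check that gsub() and length() behave as the script above expects:
# strip the letters from a sample control line and count what is left.
"$AWK" 'BEGIN { s = "T0123"; gsub(/[A-Za-z]/, "", s); print length(s) }'
```

The posted script can then be run as "$AWK" '...' file2 without further changes.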