help with file formatting


 
# 8  
Old 09-18-2009
Code:
 nawk 'BEGIN{FS=RS="";OFS=","} $1=$1' myFile



# 9  
Old 09-18-2009
The solution of vidyadhar85 gives:

Code:
$ awk 'ORS=(NF)?",":"\n"' file
AAA,pqr,jkl,mnop,abcd,
BBB,abc,pqrs,xyz,uvw,,efgh,uvw,,rpk,
CCC,123,456,789,$

The solution of summer_cherry gives:

Code:
$ sed -n '/^$/!{
      $!{H;}
      ${H;x;s/\n/,/2;s/\n//g;p;}
      }
    /^$/{x;s/\n/,/2;s/\n//g;p;}' file
AAA,pqr,jkl,mnop,abcd
BBB,abc,pqrs,xyz,uvw,efgh,uvw,rpk
CCC,123,456,789
$

The solution of radoulov gives:

Code:
$ awk -F, 'END { print r }
NR > 1 && /[A-Z]/ {
  print r; r = ""
  }
{ r = r ? r $0 : $0 FS }
' file
AAA,
pqr,jkl,mnop,abcd,
BBB,
abc,pqrs,xyz,uvw,,
efgh,uvw,,
rpk,
CCC,123,456,789
$

The solution of vgersh99 gives:

Code:
$ nawk 'BEGIN{FS=RS="";OFS=","} $1=$1' file
A,A,A,
,p,q,r,,,j,k,l,,,m,n,o,p,,,a,b,c,d
B,B,B,
,a,b,c,,,p,q,r,s,,,x,y,z,,,u,v,w,,,
,e,f,g,h,,,u,v,w,,,
,r,p,k
C,C,C,
,1,2,3,,,4,5,6,,,7,8,9
$


The expected output is:

Code:
AAA,pqr,jkl,mnop,abcd
BBB,abc,pqrs,xyz,uvw,efgh,uvw,rpk
CCC,123,456,789

Only the solution of summer_cherry gives me the right output. Am I missing something?

My approach:

Code:
awk '{$1=$1;gsub(",,",",")}1' OFS="," RS="\n\n" file

Output:

Code:
$ awk '{$1=$1;gsub(",,",",")}1' OFS="," RS="\n\n" file
AAA,pqr,jkl,mnop,abcd
BBB,abc,pqrs,xyz,uvw,efgh,uvw,rpk
CCC,123,456,789
$

Regards
# 10  
Old 09-18-2009
I get the following output:
Code:
$ nawk 'BEGIN{FS=RS="";OFS=","} $1=$1' pr.txt
AAA,pqr,jkl,mnop,abcd
BBB,abc,pqrs,xyz,uvw,,efgh,uvw,,rpk
CCC,123,456,789

Sure, not exactly what the OP wanted. Here's another version:
Code:
nawk 'BEGIN{FS=RS="";OFS=","} {for(i=1;i<=NF;i++) sub(",$", "", $i)}$1=$1' pr.txt
AAA,pqr,jkl,mnop,abcd
BBB,abc,pqrs,xyz,uvw,efgh,uvw,rpk
CCC,123,456,789

Franklin52, the potential issue with your code is that it assumes none of the lines contains an embedded empty field (',,'), as in:
Code:
BBB
abc,pqrs,,xyz,uvw,
efgh,,uvw,
rpk
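
To make the difference concrete, here is a minimal sketch run against the sample above. It uses POSIX paragraph mode (RS="") instead of a multi-character RS so that it should run on most awk implementations; that substitution and the variable names sample, collapsed and kept are my own assumptions, not code from the thread.

```shell
# Sample with genuine empty fields (",,") inside the lines.
sample='BBB
abc,pqrs,,xyz,uvw,
efgh,,uvw,
rpk'

# Approach 1: join the lines with OFS, then collapse every ",,".
# The genuine empty fields are swallowed as well.
collapsed=$(printf '%s\n' "$sample" |
  awk 'BEGIN { RS = ""; OFS = "," } { $1 = $1; gsub(/,,/, ","); print }')

# Approach 2: strip only a trailing comma from each line before joining.
# The embedded empty fields survive.
kept=$(printf '%s\n' "$sample" |
  awk 'BEGIN { RS = ""; FS = "\n" }
       {
         out = $1
         for (i = 2; i <= NF; i++) {
           line = $i
           sub(/,$/, "", line)   # drop one trailing comma, if any
           out = out "," line
         }
         print out
       }')

printf '%s\n%s\n' "$collapsed" "$kept"
```

The first line printed has lost both empty fields; the second keeps them.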

# 11  
Old 09-18-2009
Franklin52's post is important. Which version of awk produces the output you mentioned when running my code?

Code:
% nawk --version
awk version 20070501
% nawk -F, 'END { print r }
NR > 1 && /[A-Z]/ {
  print r; r = ""
  }
{ r = r ? r $0 : $0 FS }
' infile        
AAA,pqr,jkl,mnop,abcd
BBB,abc,pqrs,xyz,uvw,efgh,uvw,rpk
CCC,123,456,789
% gawk --version |head -1
GNU Awk 3.1.7
% gawk -F, 'END { print r }
NR > 1 && /[A-Z]/ {
  print r; r = ""
  }
{ r = r ? r $0 : $0 FS }
' infile        
AAA,pqr,jkl,mnop,abcd
BBB,abc,pqrs,xyz,uvw,efgh,uvw,rpk
CCC,123,456,789

On Solaris:

Code:
$ nawk -F, 'END { print r }
> NR > 1 && /[A-Z]/ {
>   print r; r = ""
>   }
> { r = r ? r $0 : $0 FS }
> ' infile
AAA,pqr,jkl,mnop,abcd
BBB,abc,pqrs,xyz,uvw,efgh,uvw,rpk
CCC,123,456,789
$ /usr/xpg4/bin/awk -F, 'END { print r }
NR > 1 && /[A-Z]/ {
  print r; r = ""
  }
{ r = r ? r $0 : $0 FS }
' infile
AAA,pqr,jkl,mnop,abcd
BBB,abc,pqrs,xyz,uvw,efgh,uvw,rpk
CCC,123,456,789

Input file used:

Code:
AAA
pqr,jkl,mnop,abcd

BBB
abc,pqrs,xyz,uvw,
efgh,uvw,
rpk

CCC
123,456,789



---------- Post updated at 02:56 PM ---------- Previous update was at 02:53 PM ----------

Franklin52's code assumes an awk implementation that supports a multi-character record separator (RS), so I suppose it will work only with GNU awk or tawk (?).
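
A possibly more portable variant, offered only as a sketch: POSIX awk treats an empty RS as paragraph mode (records are separated by blank lines, and a newline always acts as a field separator), so the same idea can be tried without a multi-character RS. The data is the input file shown above; the variable names input and result are illustrative.

```shell
# Input file from the thread.
input='AAA
pqr,jkl,mnop,abcd

BBB
abc,pqrs,xyz,uvw,
efgh,uvw,
rpk

CCC
123,456,789'

# RS="" selects paragraph mode, so no multi-character RS is needed.
# $1 = $1 rebuilds the record with OFS; gsub collapses the doubled commas
# left where a line already ended in a comma.
result=$(printf '%s\n' "$input" |
  awk 'BEGIN { RS = ""; OFS = "," } { $1 = $1; gsub(/,,/, ","); print }')

printf '%s\n' "$result"
```

This still collapses every ",,", so it shares the limitation with embedded empty fields mentioned earlier in the thread.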

---------- Post updated at 02:59 PM ---------- Previous update was at 02:56 PM ----------

Franklin52, could you please post the sample data used in your examples? Is it different from the OP's example?

---------- Post updated at 03:09 PM ---------- Previous update was at 02:59 PM ----------

vgersh99's solutions are specific to old nawk, because setting FS to an empty string has a different meaning in other awk implementations (not sure about mawk and tawk, though).
# 12  
Old 09-18-2009
Strange, I am using GNU Awk 3.1.5 and the same input file as the OP...

Regards
# 13  
Old 09-18-2009
Quote:
Originally Posted by Franklin52
Strange, I am using GNU Awk 3.1.5 and the same input file as the OP...

Regards
Thank you very much for the info. It's because of the locale (you're using UTF-8 or similar).
The correct command should be:

Code:
LANG=C awk -F, 'END { print r }
NR > 1 && /[A-Z]/ { 
  print r; r = "" 
  }
{ r = r ? r $0 : $0 FS }
' infile

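The problematic part is [A-Z]: in some non-C locales (UTF-8 ones in particular) the range follows the locale's collation order and can also match lowercase letters, so lines such as pqr,jkl,mnop,abcd are wrongly treated as the start of a new group.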

---------- Post updated at 03:30 PM ---------- Previous update was at 03:27 PM ----------

Another workaround is to use [[:upper:]] (if supported) instead of [A-Z].
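
One detail worth checking: when LC_ALL is set in the environment it takes precedence over LANG, so LANG=C alone may have no effect there. A minimal sketch that sets LC_ALL=C for the single awk call instead (the data is the input file posted earlier in the thread; the variable names input and result are illustrative, and the parentheses around the conditional are added only for readability):

```shell
# Input file from the thread.
input='AAA
pqr,jkl,mnop,abcd

BBB
abc,pqrs,xyz,uvw,
efgh,uvw,
rpk

CCC
123,456,789'

# LC_ALL=C applies to this awk invocation only, so [A-Z] matches
# uppercase ASCII letters and nothing else.
result=$(printf '%s\n' "$input" |
  LC_ALL=C awk -F, 'END { print r }
NR > 1 && /[A-Z]/ { print r; r = "" }   # a new group starts: flush the previous one
{ r = (r ? r $0 : $0 FS) }              # first line of a group gets a trailing comma
')

printf '%s\n' "$result"
```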
# 14  
Old 09-18-2009
Only the solutions of summer_cherry and radoulov work as expected on my OS.
Code:
# /usr/bin/awk --version
awk version 20070501 (FreeBSD)
