Removing CR/LF until the number of fields is full


 
# 1  
Old 12-04-2013

I have a file
Code:
1|2|3|4
a|b|c|d
1|2
3|4
a|
b|
c|
d|

Each record should have 4 fields so that it can be loaded into a database. The file may contain CR, LF, or end-of-line characters.

What I want to see as output is
Code:
1|2|3|4
a|b|c|d
1|23|4
a|b|c|d

I have tried

Code:
BEGIN {FS="|";break_flag = 0;field_count=4}
{
#print NF
delc=gsub(/\|/,"|",$0)
print delc
if (NF == 4 )
{
print $0
}
else if (delc != 3)
{
#gsub("\"","")
gsub(/\r/,"")
printf("%s|",$0)
}
}

What I want: if a line does not have 4 fields, keep reading until there are 4 fields, then print them without CR, LF, etc., using only | as the field separator.

Thanks,
Tim
Moderator's Comments:
Mod Comment
Please use code tags when posting data and code samples!

Last edited by vgersh99; 12-04-2013 at 01:39 PM.. Reason: code tags, please!
# 2  
Old 12-04-2013
What is an "end of line" character?

Another trick you can do with awk is changing the record separator, which is what it considers a 'line' to be. This lets it read one field at a time instead of one line at a time, so it has fewer decisions to make: it just piles up four records before printing.

tr can easily convert one or more \r\n or whatever into | to make things easier for it. Tell me what an "end of line character" is and it can include that too.


Code:
tr -s '\r\n' '|' < inputfile |
awk '
    { A = A OFS $0; N++ }                          # one field per record
    N >= W { print substr(A, 2); A = ""; N = 0 }   # W fields make a row
    END { if (N) print substr(A, 2) }              # flush a partial last row
' RS="|" OFS="|" W=4 -
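To see what the record-separator trick does on its own, here is a minimal sketch. The function name `fields_as_records` is made up for illustration, and the sketch assumes input with LF-only line endings. With `RS` set to `|`, awk hands the program one field per record:

```shell
# Hypothetical helper: number every field of pipe-delimited input.
fields_as_records() {
    tr '\n' '|' |                         # turn line breaks into separators
    awk -v RS='|' '{ print NR ": " $0 }'  # RS="|" makes each field one record
}

# Usage: fields_as_records < inputfile
```

One thing to keep in mind with the `tr -s` variant: squeezing applies to every run of `|` in the translated stream, so empty fields such as `a||b` in the original data would collapse into `a|b`.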


Last edited by Corona688; 12-04-2013 at 02:04 PM..
# 3  
Old 12-04-2013
What if a partial line has e.g. 3 fields and the next line has 3 fields as well? Concatenating the two lines would give a record 6 fields long. Would that be acceptable?
And, how did you assemble that 1|23|4 in your sample?
However, for the input sample given, you might try
Code:
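awk '
        {
         gsub (/\r/, "")                 # drop carriage returns
         gsub (/\|$/, "")                # drop a trailing separator
         $0 = X (X != "" ? "|" : "") $0  # prepend the pending partial row
        }
NF < 4  {X = $0; next}                   # still short: keep collecting
        {print; X = ""}                  # complete row: print and reset
' FS="|" file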
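The concern raised in this post (two partial lines adding up to more than the expected number of fields) can at least be detected after the fact. A minimal sketch, assuming `|` as the separator; the function name `check_fields` and its default of 4 fields are made up for illustration:

```shell
# Hypothetical helper: report rows whose field count differs from W.
check_fields() {
    awk -F'|' -v W="${1:-4}" '
        NF != W { printf "line %d: %d fields\n", NR, NF; bad = 1 }
        END     { exit bad + 0 }          # non-zero status if any row is off
    '
}

# Usage: check_fields 4 < outputfile
```

Running it over the joined output before loading the database gives a quick list of rows that still need attention.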


Last edited by RudiC; 12-04-2013 at 03:12 PM.. Reason: typo
# 4  
Old 12-04-2013
Thanks!

By end-of-line character I mean what vi displays as $. What I really have is a file with 80 fields. Some fields hold free-form text in which people pressed the <enter> key, and those keystrokes show up as CR or LF characters. I want to read the file, replace every CR or LF I hit with a space, and keep going until I have 80 fields. After the 80th field comes the newline, then I read the next 80 fields, and so on. Some rows are good and have all 80 fields without CR/LF, but some do not. The goal is to make the file uniform. I only used the data example above to try this on a smaller scale.

And RudiC, that was a typo. It should be

Code:
1|2|3|4
a|b|c|d
1|2|3|4
a|b|c|d
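For the 80-field case described in this post, where a line break sits inside a free-text field and should become a space, one possible approach is to count separators instead of fields. This is only a sketch under stated assumptions: rows do not end with a trailing `|`, so a full row of W fields contains W-1 separators, and the name `join_rows` is made up for illustration.

```shell
# Hypothetical helper: join physical lines until a row holds W fields.
# Assumes no trailing "|", so a full row has W-1 separators.
join_rows() {
    awk -v W="${1:-80}" '
        {
            sub(/\r$/, "")                      # strip the CR of a CRLF ending
            gsub(/\r/, " ")                     # any other CR becomes a space
            row = (pending ? row " " $0 : $0)   # a line break becomes a space
            pending = 1
            n = gsub(/\|/, "|", row)            # count separators, row unchanged
        }
        n >= W - 1 { print row; row = ""; pending = 0 }
        END { if (pending) print row }          # flush an incomplete last row
    '
}

# Usage: join_rows 80 < inputfile > fixedfile
```

A known limitation of this sketch: if the line break falls inside the last field of a row, the row already has all its separators and is printed early, so the remainder of that field ends up at the start of the next row.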

Moderator's Comments:
Mod Comment
Please use code tags when posting data and code samples!

Last edited by vgersh99; 12-04-2013 at 03:25 PM.. Reason: once again - PLEASE use code tags!!!
# 5  
Old 12-04-2013
Try:

Code:
$ cat file
1|2|3|4
a|b|c|d
1|2
3|4
a|
b|
c|
d|

Code:
$ cat test.sh
awk '
    NF==4
    NF!=4{
           i += $NF !~ /[[:alnum:]]/? NF-1 : NF   # ignore an empty last field
           gsub(/\|$/,x)                          # drop trailing | (x is empty)
           printf "%s", (i < 4 ? $0 FS : $0 RS)   # data is never the format
           i = i == 4 ? 0 : i
         }
    ' FS="|" file

Code:
$ bash test.sh
1|2|3|4
a|b|c|d
1|2|3|4
a|b|c|d

---edit----

Corona688 brilliant !

Last edited by Akshay Hegde; 12-04-2013 at 04:05 PM..
# 6  
Old 12-04-2013
It seems to work, but I think I am hitting a buffer issue with a variable. It reports the error "^ ran out for this one" on one line and does not process any further.

Thanks,
Tim
# 7  
Old 12-04-2013
Quote:
Originally Posted by tampatim
It seems to work, but I think I am hitting a buffer issue with a variable. It reports the error "^ ran out for this one" on one line and does not process any further.
What gets what error? Show us exactly what you did and exactly what happened, word for word, letter for letter, keystroke for keystroke.

Did you try my code?
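A possible explanation, offered as an assumption because the failing command and input were not posted: in gawk, the text "^ ran out for this one" is part of the "not enough arguments to satisfy format string" diagnostic. It appears when the format string of a printf contains a % conversion with no matching argument, which can happen if the data itself is used as the format and a record contains a % character. Passing the data as an argument to an explicit "%s" format avoids that. The function name `emit_safe` below is made up for illustration:

```shell
# Hypothetical helper: print each line through an explicit "%s" format,
# so a % inside the data is output literally.
emit_safe() {
    awk '{ printf "%s\n", $0 }'
}

# Usage: emit_safe < inputfile
```

If the error persists with an explicit format, the exact command and the offending input line would be needed to narrow it down further.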