Removing extra new line characters


 
# 1  
Old 10-03-2011

Hello,

I have a text file that looks like:

Code:
ABC123|some text|some more text|00001
00002
0003
0004
000019|000003|Item

I searched and found an example that removes the extra newline characters using grep and sed, but I think it assumes that lines start with a digit and that the column separator is ','. My lines can start with letters or digits, and the column separator is a pipe. I need to replace the extra newlines with a space and keep only the final newline of each record. Any suggestions, please?

Here is the sample from another thread:

Code:
echo `/usr/bin/grep -vE "^[[:blank:]]*$" inputfile | sed 's|^\([0-9]\)|:\1|;s|\([0-9]\)$|\1:|' | tr '\n' ' '`| sed 's|: *:|:|g' | awk -F: '{print$0}' RS=:

Thanks.
# 2  
Old 10-03-2011
How can we tell where the line really ends? Is there supposed to be a fixed number of columns?
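
One way to see the problem, as a rough sketch: print the number of pipe-separated fields on each physical line. Lines with fewer fields than expected were broken mid-record. The temporary file name `data` is an assumption; the contents are the sample from post #1.

```shell
# Recreate the sample from post #1 in a scratch directory
# (the directory and the file name "data" are assumptions).
tmp=$(mktemp -d)
printf '%s\n' \
    'ABC123|some text|some more text|00001' \
    '00002' \
    '0003' \
    '0004' \
    '000019|000003|Item' > "$tmp/data"

# Print "line number: field count" for every physical line.
field_counts=$(awk -F'|' '{ printf("%d: %d\n", NR, NF) }' "$tmp/data")
echo "$field_counts"
```

A line whose count is lower than the real column count is either the start of a broken record or a continuation of one.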
# 3  
Old 10-04-2011
reply to message

Thanks.

Yes, sorry, there are 44 columns in the line.
# 4  
Old 10-04-2011
Code:
$ cat mergecol.awk
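# Split on pipes; MAX is the number of columns per output record.
BEGIN { FS="|"; MAX=4; L=0      }

{
        # Collect fields across input lines, ignoring where the lines break.
        for(N=1; N<=NF; N++)
        {
                A[L++]=$N;

                # A full record has been collected: print it and start over.
                if(L >= MAX)
                {
                        printf("%s", A[0]);
                        for(M=1; M<L; M++) printf("%s%s", FS, A[M]);
                        printf("\n");
                        L=0;
                }
        }
}
END {
        # Nothing left over, so nothing more to print.
        if(L<1) exit;

        # Print the final, partial record.
        printf("%s", A[0]);
        for(M=1; M<L; M++) printf("%s%s", FS, A[M]);
        printf("\n");
}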

$ awk -f mergecol.awk < data
ABC123|some text|some more text|00001
00002|0003|0004|000019
000003|Item
$

I used a smaller number of columns for testing. Change MAX=4 to MAX=44 for your data.
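
Instead of editing the script for each file, the column count could be passed on the command line. This is only a sketch, not the script as posted: the BEGIN block is changed so that MAX=4 is just a default, which lets `-v MAX=...` take effect. The scratch directory and file names are assumptions.

```shell
tmp=$(mktemp -d)
printf '%s\n' \
    'ABC123|some text|some more text|00001' \
    '00002' \
    '0003' \
    '0004' \
    '000019|000003|Item' > "$tmp/data"

cat > "$tmp/mergecol.awk" <<'EOF'
# Same logic as mergecol.awk above, except MAX=4 is only a default
# that applies when no -v MAX=... is given (an assumed change).
BEGIN { FS="|"; if(MAX == "") MAX=4; L=0 }

{
        for(N=1; N<=NF; N++)
        {
                A[L++]=$N;

                if(L >= MAX)
                {
                        printf("%s", A[0]);
                        for(M=1; M<L; M++) printf("%s%s", FS, A[M]);
                        printf("\n");
                        L=0;
                }
        }
}
END {
        if(L<1) exit;

        printf("%s", A[0]);
        for(M=1; M<L; M++) printf("%s%s", FS, A[M]);
        printf("\n");
}
EOF

# Regroup the ten sample fields into records of five columns each.
merged=$(awk -v MAX=5 -f "$tmp/mergecol.awk" "$tmp/data")
echo "$merged"
```

For the real data the call would be `awk -v MAX=44 -f mergecol.awk < data`.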
# 5  
Old 10-04-2011
reply

Thanks for the reply. I forgot one thing.

original:
Code:
ABC123|some text|some more text|00001
00002
0003
0004
000019|000003|Item

The broken lines need to be joined with a space between them rather than kept as separate columns, so their contents end up in one column instead of four. Is there a modification I can make to the awk script to support that?

Code:
ABC123|some text|some more text|00001 00002 0003 0004 000019|000003|Item


Thanks.
# 6  
Old 10-04-2011
That would have been nice to know. Working on it.

---------- Post updated at 10:39 AM ---------- Previous update was at 10:23 AM ----------

Code:
$ cat mergecol2.awk

BEGIN { FS="|"; MAX=4; L=0      }

# Print the L columns collected so far as one record, then start over.
function flush(    M)
{
        printf("%s", A[0]);
        for(M=1; M<L; M++) printf("%s%s", FS, A[M]);
        printf("\n");
        L=0;
}

# A line with a single field continues the previous column.
NF==1 {
        if(L > 0) A[L-1]=A[L-1] " " $1;
        else      A[L++]=$1;
}

NF>1 {
        for(N=1; N<=NF; N++)
        {
                # Record already full: print it before storing this field,
                # so the field that triggers the flush is not lost.
                if(L >= MAX) flush();
                A[L++]=$N;
        }
}

END {   if(L > 0) flush();      }

$ awk -f mergecol2.awk < data
ABC123|some text|some more text|00001 00002 0003 0004
000019|000003|Item
$
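
The output above still breaks after four columns, while post #5 asked for a single joined record. As an alternative sketch (a different technique, not the script above), physical lines can be glued together with a space until the pending record holds the expected number of columns. `COLS`, `REC`, `PARTS` and the file name are assumptions; the sample joins into 6 columns, the real data would use 44. This only works if the last column never contains a line break, because the record is printed as soon as the count is reached.

```shell
tmp=$(mktemp -d)
printf '%s\n' \
    'ABC123|some text|some more text|00001' \
    '00002' \
    '0003' \
    '0004' \
    '000019|000003|Item' > "$tmp/data"

joined=$(awk -v COLS=6 '
{
        # Append this physical line to the pending record, space-separated.
        REC = (REC == "") ? $0 : REC " " $0;

        # split() returns the number of pipe-separated columns so far.
        if(split(REC, PARTS, "|") >= COLS)
        {
                print REC;
                REC = "";
        }
}
END {   if(REC != "") print REC;        }
' "$tmp/data")
echo "$joined"
```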
