Append next line to previous lines when NF is less than 0


 
# 8  
Old 04-17-2014
Quote:
Originally Posted by cumeh1624
It means the very line has one field without cedilla as a field separator or it has a blank line.
No.
With the awk script:
Code:
awk -FÇ '{print NF, $0}'

The number of fields printed as the 1st field in the output will be the number of Ç characters present on the line plus 1 for any line that contains any characters other than the terminating <newline> character. The only lines that will have NF < 1 (i.e., NF == 0) will be empty lines. Blank lines (lines containing only <space> and <tab> characters and the terminating <newline> character) that are not empty lines (lines containing only the <newline> character) will have NF == 1 when the field separator is Ç.
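A quick way to check these rules is to feed awk one line of each kind. The sketch below uses | as a stand-in for Ç so that it does not depend on the locale or the file encoding; the sample lines and the function name count_fields are made up for illustration.

```shell
# count_fields SEP: print NF for every line on standard input,
# using SEP as the field separator.
count_fields() {
    awk -F"$1" '{ print NF }'
}

# One line of each kind: three fields, empty, blank (space + tab), one field.
printf 'a|b|c\n\n \t\nplain\n' | count_fields '|'
# prints 3, 0, 1 and 1, one number per line
```

Only the empty line reports NF == 0; the blank line and the line without a separator both report NF == 1.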
# 9  
Old 04-20-2014
The command in our script that appends the next line to the previous line takes almost 4 hours because of the while loop. For performance reasons we are looking for a faster command, such as awk, but your lines of code do not produce the same data file count.

The code in our script, which takes four hours to complete, produces an output data file count of 176060, which is more than the 175044 that your code produces.

Below is what the code in our script looks like:

Code:
FLAG=0
cat filename | while read CUR_LINE
do
if [[ $FLAG -ne 0 ]];then
If [[ `echo ${CUR_LINE} | awk -F "Ç" '{print NF -1}'` -le 0 ]];then
PREV_LINE="{PREV_LINE} ${CUR_LINE}"
NEW_LINE=`echo ${PREV_LINE} | tr -d '\n' | tr -d '^M'`
PREV_LINE=${NEW_LINE}
else
echo ${PREV_LINE} >> ${OUT_FILE}
PREV_LINE='${CUR_LINE}
fi
else
PREV_LINE=${CUR_LINE}
FLAG=1
fi
done
echo ${PREV_LINE} >> ${OUT_FILE}

What I'm looking for is to reduce the 4-hour completion time, which your code does, but the output file count is different and the formatting is also different.

Please let me know if you have a suggestion to this issue.

Last edited by Scrutinizer; 04-20-2014 at 01:19 PM.. Reason: CODE tags
# 10  
Old 04-20-2014
That script is indeed inefficient and would take a long time, but it cannot be the actual script: it contains several syntax errors, and the last line will probably be lost (in most shells this while loop gets executed in a subshell because of the pipe, so PREV_LINE is empty again once the loop has finished).
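The subshell problem can be avoided by redirecting the file into the loop instead of piping cat into it. A minimal sketch, in which the function name last_line and the temporary file are only there for illustration:

```shell
# last_line FILE: remember the last line read and print it after the loop.
# The loop reads from a redirection, not a pipe, so it runs in the current
# shell and $prev is still set when the loop is done.
last_line() {
    prev=
    while IFS= read -r cur
    do
        prev=$cur
    done < "$1"
    printf '%s\n' "$prev"
}

tmp=$(mktemp)
printf 'first\nsecond\nthird\n' > "$tmp"
last_line "$tmp"    # prints: third
rm -f "$tmp"
```

With cat FILE | while ... in bash (default settings), the assignment to prev happens in a subshell and the final printf would print an empty line instead.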

Please post a relevant input file and desired output and specify what OS and version you are using. Also, are there carriage returns in your input file?
# 11  
Old 04-20-2014
After reformatting your code so we can see the structure, getting rid of the subshell issue Scrutinizer mentioned, adding missing <dollar-sign> characters, changing <single-quote> characters to <double-quote> characters, and adding missing <double-quote> characters to get around syntax errors:
Code:
OUT_FILE=out
FLAG=0
while read CUR_LINE
do
        if [[ $FLAG -ne 0 ]]
        then
                if [[ `echo ${CUR_LINE} | awk -F "Ç" '{print NF -1}'` -le 0 ]]
                then
                        PREV_LINE="${PREV_LINE} ${CUR_LINE}"
                        NEW_LINE=`echo ${PREV_LINE} | tr -d '\n' | tr -d '^M'`
                        PREV_LINE="${NEW_LINE}"
                else
                        echo ${PREV_LINE} >> ${OUT_FILE}
                        PREV_LINE="${CUR_LINE}"
                fi
        else
                PREV_LINE="${CUR_LINE}"
                FLAG=1
        fi
done < filename
echo ${PREV_LINE} >> ${OUT_FILE}

we can see that this is grossly inefficient code. Having a while loop is not your problem; executing awk once for each of your 1.7 million input lines (except the 1st) and tr twice for each empty line and each line with only one field (especially since one of those invocations of tr is always a no-op) is going to be extremely slow.
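To illustrate the point, here is a sketch of the same loop with the per-line awk and tr calls replaced by a shell case pattern match, so that no external process is started for any line. It assumes the rule is simply "a line that does not contain the separator continues the previous line". The function name join_broken is a placeholder, it does not remove <carriage-return> characters, and it is not meant as a drop-in replacement for the original script.

```shell
# join_broken SEP: read lines from standard input and append every line
# that does not contain SEP to the previous line, separated by one space.
join_broken() {
    sep=$1
    prev=
    first=1
    # "|| [ -n "$cur" ]" also handles a last line without a final newline.
    while IFS= read -r cur || [ -n "$cur" ]
    do
        if [ "$first" -eq 1 ]
        then
            prev=$cur                      # first line starts the first record
            first=0
        else
            case $cur in
                *"$sep"*)                  # separator found: a new record starts
                    printf '%s\n' "$prev"
                    prev=$cur
                    ;;
                *)                         # no separator: continuation line
                    prev="$prev $cur"
                    ;;
            esac
        fi
    done
    if [ "$first" -eq 0 ]
    then
        printf '%s\n' "$prev"              # flush the last record
    fi
}
```

It would be called as join_broken 'Ç' < filename > out. A single awk process will still be faster on a large file, but this version already avoids the millions of process creations that dominate the run time.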

Your code seems to be trying to remove <carriage-return> characters from your input (which you never mentioned were present before). And, we can't tell if you're trying to remove <carriage-return> or circumflex and upper-case M characters. (The above code removes all circumflex and upper-case M characters from your input.)

It also converts all sequences of one or more adjacent <space> and <tab> characters to a single <space> character (which again was not mentioned as a requirement until now). Is this intentional, or an accident? Or does your input contain no <tab> characters and no occurrences of multiple adjacent <space> characters?

It gets rid of backslash characters at the ends of input lines and joins lines that end with <backslash> characters no matter how many fields are on the joined lines. Is this intentional, or an accident? Or, are you sure that none of your input lines end with a <backslash> character just before a <newline> character?

And, depending on what shell you're using and what operating system you're using, any other <backslash> characters in your input could be deleted or converted to other characters by your uses of echo.
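These side effects can be reproduced in isolation. The sketch below contrasts read with read -r, and an unquoted echo with a quoted printf, in a POSIX shell such as bash or ksh; the function names and sample strings are invented for the demonstration.

```shell
# Plain read treats a trailing backslash as a line continuation and drops it.
read_plain() { read line; printf '%s\n' "$line"; }
# read -r with an empty IFS keeps the line exactly as it was written.
read_raw() { IFS= read -r line; printf '%s\n' "$line"; }

printf 'abc\\\ndef\n' | read_plain   # prints: abcdef
printf 'abc\\\ndef\n' | read_raw     # prints: abc\

# An unquoted expansion is split on blanks, so echo rejoins the words with
# single spaces; quoting the expansion preserves the original spacing.
spaced='a    b'
echo $spaced                # prints: a b
printf '%s\n' "$spaced"     # prints: a    b
```

Using IFS= read -r and printf '%s\n' "$var" throughout avoids all three surprises: joined lines, squeezed blanks, and backslashes altered by echo.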

Please show us the code you are really using. Please also upload a SMALL sample input file (not more than 50 lines) that contains examples of all of the transformations that need to take place while removing characters, joining lines, and squeezing blanks, AND upload the desired output corresponding to that input. I explicitly say upload because we need to be sure that we will be able to see the difference between spaces and tabs in your desired input and output and see the <carriage-return> characters in your input.
# 12  
Old 04-20-2014
Hi Don,
That's exactly what the code looks like, and all I'm looking for is to reduce the completion time. I didn't mention the removal of newlines, the squeezing of blanks, and the control-M characters because once I figure out how to reduce the completion time, I can easily implement the other functionality.

I'm not allowed to share a sample of the data file, but if you can suggest a way around that or commands that would reduce the completion time, I will really appreciate it.

Cumeh1624
# 13  
Old 04-20-2014
The code you showed us had syntax errors and would not run with any shell we have ever seen.

The code we have suggested would do exactly what you asked for, but clearly doesn't do what you want with the data you have. If you aren't able to give us a representative sample of data (scrubbed of any private data) that we can use to see what you're actually trying to do, we can't help you.

You have said that the scripts we have provided don't correctly format your output. How can we possibly guess at what that means if we can't see representative input and desired corresponding output?
# 14  
Old 04-21-2014
Code:
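awk -F "Ç" '
  NR == 1 {p = $0; next}                             # first line starts the first record
  NF > 1 {gsub(/\r/, "", p); print p; p = $0; next}  # new record: strip CRs, print previous
  {p = (p " " $0)}                                   # no separator: append to previous record
  END {gsub(/\r/, "", p); print p}' filename         # print the last record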
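Since the real data cannot be posted, one way to check a one-pass script like this is to run it on made-up data and to count how many records the rule should produce. The sketch below again uses | as a stand-in for Ç; the function names join_awk and expected_records and the sample lines are hypothetical.

```shell
# join_awk SEP: join continuation lines from standard input in one awk process.
join_awk() {
    awk -F"$1" '
        { sub(/\r$/, "") }                 # drop a trailing carriage return, if any
        NR == 1 { p = $0; next }           # first line starts the first record
        NF > 1  { print p; p = $0; next }  # separator present: flush, start a new record
        { p = p " " $0 }                   # no separator: continuation line
        END { if (NR) print p }            # flush the last record
    '
}

# expected_records SEP: number of records the rule should produce, that is,
# the first line plus every later line that contains the separator.
expected_records() {
    awk -F"$1" 'NR == 1 || NF > 1 { n++ } END { print n + 0 }'
}

sample=$(printf '1|a|b\r\ncontinued\r\n2|c|d\r\n3|e|f\r\n')
printf '%s\n' "$sample" | join_awk '|'           # prints three joined records
printf '%s\n' "$sample" | expected_records '|'   # prints: 3
```

The number from expected_records can be compared with wc -l on the output of either script to see which one deviates from the stated rule.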
