Text going to new line?


 
# 1  
Old 02-14-2014

Hi all, I am running an awk script to generate a file with several columns. In the output, the contents of a single line get continued on a second line. Why is this happening and how can it be fixed?
The output looks fine on a Windows machine, except that it shows on 2 lines instead of one.

When I look at just the first line in a Unix terminal, it looks jumbled up:

Code:
head -1 result1.txt
comp1000E1Z7R6q1E1Z7R6_CHLVA.00%17357333967     XP_005850320.1  552841484; 307109981    571     GO:0003676; GO:0000166; GO:0008270al protein CHLUniRef100_E1Z7R6ChlorellUniRef90_E1Z7R6 UniRef50_E1Z7R6 UPI0001E83686           554065      20852019 GL433838        EFN58218.1

When I look at the same first line after transferring it to Windows:

Code:
comp1000_c0_seq1 EFN58218 37.00%    967    1    713    254    670    466    571    3.00E-07    57    hypothetical protein CHLNCDRAFT_142094 [Chlorella variabilis]    nrplant
    E1Z7R6    E1Z7R6_CHLVA    17357333    XP_005850320.1    552841484; 307109981        GO:0003676; GO:0000166; GO:0008270        UniRef100_E1Z7R6    UniRef90_E1Z7R6    UniRef50_E1Z7R6    UPI0001E83686        554065            20852019    GL433838    EFN58218.1

The actual output should be a single line so that I can put it into Excel.
# 2  
Old 02-14-2014
What OS are you using to create result1.txt? What awk commands are you using to generate result1.txt on that OS?

How are you transferring result1.txt to your windows system?

Last edited by Don Cragun; 02-14-2014 at 12:51 PM.. Reason: Fix typo.
# 3  
Old 02-14-2014
Which application are you using to view the file in Windows?

Maybe word wrap is turned on!
# 4  
Old 02-14-2014
Pipe your output through a count (such as wc -l) to see how many records you are actually getting. Perhaps it's just cosmetic like Master Yoda suggested.



Robin
# 5  
Old 02-14-2014
Hi Don, I'm using tab as the delimiter for the script. I have pasted the script below.
It works fine with smaller samples. Also, I'm transferring the file with WinSCP, using drag and drop.
Code:
more /etc/*-release
Red Hat Enterprise Linux Server release 5.9 (Tikanga)



Hi Yoda, text wrap is not turned on. I'm using TextPad and Excel to view the file.


Here is some more information for the same output line that might be useful:
Code:
head -1 result1.txt | od -c
0000000   c   o   m   p   1   0   0   0   _   c   0   _   s   e   q   1
0000020       E   F   N   5   8   2   1   8       3   7   .   0   0   %
0000040  \t   9   6   7  \t   1  \t   7   1   3  \t   2   5   4  \t   6
0000060   7   0  \t   4   6   6  \t   5   7   1  \t   3   .   0   0   E
0000100   -   0   7  \t   5   7  \t   h   y   p   o   t   h   e   t   i
0000120   c   a   l       p   r   o   t   e   i   n       C   H   L   N
0000140   C   D   R   A   F   T   _   1   4   2   0   9   4       [   C
0000160   h   l   o   r   e   l   l   a       v   a   r   i   a   b   i
0000200   l   i   s   ]  \t   n   r   p   l   a   n   t  \r  \t   E   1
0000220   Z   7   R   6  \t   E   1   Z   7   R   6   _   C   H   L   V
0000240   A  \t   1   7   3   5   7   3   3   3  \t   X   P   _   0   0
0000260   5   8   5   0   3   2   0   .   1  \t   5   5   2   8   4   1
0000300   4   8   4   ;       3   0   7   1   0   9   9   8   1  \t  \t
0000320   G   O   :   0   0   0   3   6   7   6   ;       G   O   :   0
0000340   0   0   0   1   6   6   ;       G   O   :   0   0   0   8   2
0000360   7   0  \t  \t   U   n   i   R   e   f   1   0   0   _   E   1
0000400   Z   7   R   6  \t   U   n   i   R   e   f   9   0   _   E   1
0000420   Z   7   R   6  \t   U   n   i   R   e   f   5   0   _   E   1
0000440   Z   7   R   6  \t   U   P   I   0   0   0   1   E   8   3   6
0000460   8   6  \t  \t   5   5   4   0   6   5  \t  \t  \t   2   0   8
0000500   5   2   0   1   9  \t   G   L   4   3   3   8   3   8  \t   E
0000520   F   N   5   8   2   1   8   .   1  \t  \t  \t  \t  \n

Here is the script:

Code:
#!/bin/bash
  
  data[0]=""
  key[0]=""
  count=0
  
  search_add()
  {
    inkey=$1;indata=$2;action=$3
    if [ $action == ADD ]; then
      key[$count]=$inkey
      data[$count]=$indata
      ((count+=1))
      return 0
    elif [ $action == SEARCH ]; then
      found=""
      for((i=0;i<$count;i++))
      do
        if [ "${key[$i]}" == "$inkey" ]
        then
          found=${data[$i]}
          return 0
        fi
      done
    fi
    return 1
  }
  
  while read first sec remaining
  do
    pat=${sec#*|}; pat=${pat%|*}
    search_add $pat "" SEARCH
    if [ $? -ne 0 ]; then
      found=$( grep -m1 $pat idmapping_selected.tab )
      search_add $pat "$found" ADD
    fi
    echo -e "$first $sec $remaining\t$found\n" >> result1.txt
  
done < blastx_consol_tab.txt


Last edited by gina.lizar; 02-14-2014 at 01:53 PM..
# 6  
Old 02-14-2014
The \r in the od output (on the line at offset 0000200, right after nrplant) and the lack of a \r in the echo in the shell script indicate that blastx_consol_tab.txt has at least one line with a CR/LF line terminator instead of the conventional LF line terminator. There is a good chance that adding the command:
Code:
    remaining=${remaining%$'\r'}

(*** NOTE: I originally had # instead of % in the above statement. That was a typo. ***)
after:
Code:
    pat=${sec#*|}; pat=${pat%|*}

in the read loop will fix the problem.
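As a rough illustration of that suggestion, here is a minimal sketch of a read loop that strips a trailing carriage return from the last field. The two input lines are made up for the demonstration and only stand in for blastx_consol_tab.txt; the sketch also uses read -r and printf rather than the original read and echo -e, which is a choice made here, not something taken from the script above.

```shell
#!/bin/bash
# Hypothetical two-line sample with CR/LF terminators (not real BLAST output).
sample=$(mktemp)
output=$(mktemp)
trap 'rm -f "$sample" "$output"' EXIT
printf 'id1 gi|111|x rest of line\r\nid2 gi|222|y more text\r\n' > "$sample"

while read -r first sec remaining
do
    # CR is not a field separator, so it stays attached to the last field.
    # Remove one trailing carriage return, if there is one.
    remaining=${remaining%$'\r'}
    printf '%s %s %s\n' "$first" "$sec" "$remaining"
done < "$sample" > "$output"

# Count the output lines that still contain a carriage return.
cr_lines=$(grep -c $'\r' "$output")
echo "lines still containing CR: $cr_lines"
```

Because % removes a matching suffix only when it is present, the same loop leaves lines that end in a plain LF untouched.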

If you created (or modified) blastx_consol_tab.txt using textpad, that may be what added the unwanted carriage return.
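One way to check whether a file really contains carriage returns is to search for them with grep. The sketch below runs on a made-up three-line sample; on the real system the same grep commands would be pointed at blastx_consol_tab.txt.

```shell
#!/bin/bash
# Hypothetical sample: line 2 ends in CR/LF, lines 1 and 3 end in LF only.
sample=$(mktemp)
trap 'rm -f "$sample"' EXIT
printf 'line one\nline two\r\nline three\n' > "$sample"

# Number of lines that contain a carriage return.
cr_count=$(grep -c $'\r' "$sample")
# Line numbers of the affected lines.
cr_lines=$(grep -n $'\r' "$sample" | cut -d: -f1)

echo "lines with CR: $cr_count (line numbers: $cr_lines)"
```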

If you need the output being sent to Windows to have a CR/LF line terminator instead of just an LF terminator, change:
Code:
    echo -e "$first $sec $remaining\t$found\n" >> result1.txt

to:
Code:
    echo -e "$first $sec $remaining\t$found\r\n" >> result1.txt

or, better yet, to:
Code:
    echo -e "$first $sec $remaining\t$found\r" >> result1.txt

(The echo should already be adding an NL as a line terminator, but we can't tell for sure because you used head -1 to limit the output we see to just the first line.)
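The effect of echo's own newline can be checked on a short string. This sketch assumes bash's builtin echo with -e, as used in the script; the string a\tb is only a stand-in for a real record.

```shell
#!/bin/bash
# When the string itself ends in "\n", echo still appends its own newline,
# so every record is followed by an empty line.
lines_with_n=$(( $(echo -e 'a\tb\n' | wc -l) ))

# Ending the string in "\r" lets echo's newline complete the CR/LF pair.
lines_with_r=$(( $(echo -e 'a\tb\r' | wc -l) ))
last_two_bytes=$(echo -e 'a\tb\r' | tail -c 2 | od -An -tx1 | tr -d ' \n')

echo "lines when the string ends in a newline escape: $lines_with_n"
echo "lines when the string ends in a CR escape: $lines_with_r"
echo "last two bytes in hex: $last_two_bytes"
```

The hex bytes 0d and 0a are CR and LF, which is the terminator Windows tools expect.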

Last edited by Don Cragun; 02-17-2014 at 02:49 PM.. Reason: Fix typo.
# 7  
Old 02-17-2014
Hi Don, thank you for your invaluable insights, a few more questions for you please.

I have manually edited blastx_consol_tab.txt in TextPad. Is it possible to remove the unwanted carriage returns with a script?

The output result1.txt takes 9 days to produce. Is it possible to fix that without regenerating it?
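Both questions come down to deleting carriage returns from an existing file. One common way to do that, sketched here under the assumption that every CR in the file is unwanted, is tr -d '\r'. The sample records and the value A0A000 are made up; they only imitate the shape of the first line shown earlier in the thread, where the CR sits in the middle of the record.

```shell
#!/bin/bash
# Hypothetical damaged output: a CR sits in the middle of each record.
damaged=$(mktemp)
repaired=$(mktemp)
trap 'rm -f "$damaged" "$repaired"' EXIT
printf 'q1 hit1 nrplant\r\tE1Z7R6\nq2 hit2 nrplant\r\tA0A000\n' > "$damaged"

# Delete every carriage return and write the result to a new file.
tr -d '\r' < "$damaged" > "$repaired"

remaining_cr=$(grep -c $'\r' "$repaired")
lines_before=$(( $(wc -l < "$damaged") ))
lines_after=$(( $(wc -l < "$repaired") ))
echo "CR lines left: $remaining_cr, lines before: $lines_before, after: $lines_after"
```

Applied to the real files, that would be something like tr -d '\r' < result1.txt > result1_fixed.txt, and likewise for blastx_consol_tab.txt. Writing to a new file keeps the original intact until the result has been checked.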