Hi all, I am running an awk script to generate a file with several columns. In the output, the contents of a single line get continued onto a second line. Why is this happening, and how can it be fixed?
The output looks fine on a Windows machine, except that it shows on two lines instead of one.
When I look at just the first line in a Unix terminal, it looks jumbled up:
Code:
head -1 result1.txt
comp1000E1Z7R6q1E1Z7R6_CHLVA.00%17357333967 XP_005850320.1 552841484; 307109981 571 GO:0003676; GO:0000166; GO:0008270al protein CHLUniRef100_E1Z7R6ChlorellUniRef90_E1Z7R6 UniRef50_E1Z7R6 UPI0001E83686 554065 20852019 GL433838 EFN58218.1
When I look at the same first line after transferring to Windows
Pipe your output through a count (such as wc -l) to see how many records you are actually getting. Perhaps it's just cosmetic like Master Yoda suggested.
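A minimal sketch of that check. The two temporary files are made-up stand-ins for blastx_consol_tab.txt and result1.txt, created only so the snippet runs by itself; on the real system you would point wc -l at the actual files.

```shell
# Hypothetical stand-ins for the real input and output files.
in_file=$(mktemp)
out_file=$(mktemp)
printf 'q1 s1 rest\nq2 s2 rest\n' > "$in_file"
printf 'q1 s1 rest\tmatch1\nq2 s2 rest\tmatch2\n' > "$out_file"

# wc -l counts newline characters, i.e. newline-terminated records.
# tr strips the padding some wc implementations print.
in_count=$(wc -l < "$in_file" | tr -d ' ')
out_count=$(wc -l < "$out_file" | tr -d ' ')

echo "input records:  $in_count"
echo "output records: $out_count"

rm -f "$in_file" "$out_file"
```

Comparing the two numbers shows whether the script writes more lines than it reads.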
Hi Don, I'm using tab as the delimiter for the script. I have pasted the script below.
It works fine with smaller samples. Also, I'm transferring the file with WinSCP by drag and drop.
Code:
more /etc/*-release
Red Hat Enterprise Linux Server release 5.9 (Tikanga)
Hi Yoda, text wrap is not turned on. I'm using TextPad and Excel to view the file.
Here is some more information for the same output line that might be useful:
Code:
head -1 result1.txt | od -c
0000000 c o m p 1 0 0 0 _ c 0 _ s e q 1
0000020 E F N 5 8 2 1 8 3 7 . 0 0 %
0000040 \t 9 6 7 \t 1 \t 7 1 3 \t 2 5 4 \t 6
0000060 7 0 \t 4 6 6 \t 5 7 1 \t 3 . 0 0 E
0000100 - 0 7 \t 5 7 \t h y p o t h e t i
0000120 c a l p r o t e i n C H L N
0000140 C D R A F T _ 1 4 2 0 9 4 [ C
0000160 h l o r e l l a v a r i a b i
0000200 l i s ] \t n r p l a n t \r \t E 1
0000220 Z 7 R 6 \t E 1 Z 7 R 6 _ C H L V
0000240 A \t 1 7 3 5 7 3 3 3 \t X P _ 0 0
0000260 5 8 5 0 3 2 0 . 1 \t 5 5 2 8 4 1
0000300 4 8 4 ; 3 0 7 1 0 9 9 8 1 \t \t
0000320 G O : 0 0 0 3 6 7 6 ; G O : 0
0000340 0 0 0 1 6 6 ; G O : 0 0 0 8 2
0000360 7 0 \t \t U n i R e f 1 0 0 _ E 1
0000400 Z 7 R 6 \t U n i R e f 9 0 _ E 1
0000420 Z 7 R 6 \t U n i R e f 5 0 _ E 1
0000440 Z 7 R 6 \t U P I 0 0 0 1 E 8 3 6
0000460 8 6 \t \t 5 5 4 0 6 5 \t \t \t 2 0 8
0000500 5 2 0 1 9 \t G L 4 3 3 8 3 8 \t E
0000520 F N 5 8 2 1 8 . 1 \t \t \t \t \n
Code:
head -1 result1.txt
comp1000E1Z7R6q1E1Z7R6_CHLVA.00%17357333967 XP_005850320.1 552841484; 307109981 571 GO:0003676; GO:0000166; GO:0008270al protein CHLUniRef100_E1Z7R6ChlorellUniRef90_E1Z7R6 UniRef50_E1Z7R6 UPI0001E83686 554065 20852019 GL433838 EFN58218.1
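Code:
#!/bin/bash
# Parallel arrays used as a cache: key[i] is a search pattern,
# data[i] is the line found for it in idmapping_selected.tab.
data[0]=""
key[0]=""
count=0

# search_add KEY DATA ACTION
#   ADD:    append KEY/DATA to the cache.
#   SEARCH: set $found to the cached data for KEY; return 1 if KEY is not cached.
search_add()
{
    inkey=$1;indata=$2;action=$3
    if [ $action == ADD ]; then
        key[$count]=$inkey
        data[$count]=$indata
        ((count+=1))
        return 0
    elif [ $action == SEARCH ]; then
        found=""
        for((i=0;i<$count;i++))
        do
            if [ "${key[$i]}" == "$inkey" ]
            then
                found=${data[$i]}
                return 0
            fi
        done
    fi
    return 1
}

while read first sec remaining
do
    # Keep only the text between the first and the last "|" of the second field.
    pat=${sec#*|}; pat=${pat%|*}
    search_add $pat "" SEARCH
    if [ $? -ne 0 ]; then
        # Not cached yet: look it up once and remember the result.
        found=$( grep -m1 $pat idmapping_selected.tab )
        search_add $pat "$found" ADD
    fi
    echo -e "$first $sec $remaining\t$found\n" >> result1.txt
done < blastx_consol_tab.txt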
Last edited by gina.lizar; 02-14-2014 at 02:53 PM..
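The \r in the od output (in the row at offset 0000200) and the lack of a \r in the echo in the shell script indicate that blastx_consol_tab.txt has at least one line with a CR/LF line terminator instead of the conventional LF line terminator. read leaves that trailing CR at the end of remaining, so it lands in the middle of the output line, ahead of the tab and $found. There is a good chance that adding the command: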
Code:
remaining=${remaining%$'\r'}
(*** NOTE: I originally had # instead of % in the above statement. That was a typo. ***)
after:
Code:
pat=${sec#*|}; pat=${pat%|*}
in the read loop will fix the problem.
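The change can be tried on a small made-up input before touching the real data. This is only a sketch: the input lines, the file names and the found-for- placeholder are invented for illustration, and printf is used here in place of the script's echo -e. The CR is held in a variable so the snippet also runs in plain sh; in bash, ${remaining%"$cr"} does the same thing as ${remaining%$'\r'}.

```shell
# A literal carriage return; bash can also write this as $'\r'.
cr=$(printf '\r')

# Hypothetical two-line input with CR/LF terminators, standing in for
# blastx_consol_tab.txt.
input=$(mktemp)
printf 'id1 gi|X1|ref rest one\r\nid2 gi|X2|ref rest two\r\n' > "$input"

out=$(mktemp)
while read -r first sec remaining
do
    pat=${sec#*|}; pat=${pat%|*}
    remaining=${remaining%"$cr"}    # drop one trailing CR, if there is one
    # "found-for-$pat" is a placeholder for the grep lookup in the real script.
    printf '%s %s %s\t%s\n' "$first" "$sec" "$remaining" "found-for-$pat"
done < "$input" > "$out"

# Count output lines that still contain a CR (grep exits 1 when none match).
leftover=$(grep -c "$cr" "$out" || true)
first_line=$(head -n 1 "$out")
echo "lines still containing CR: $leftover"

rm -f "$input" "$out"
```

Without the remaining=... line, the same loop would leave a CR in front of the tab on every output line.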
If you created (or modified) blastx_consol_tab.txt using TextPad, that may be what added the unwanted carriage return.
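To see whether a file really does contain CR/LF-terminated lines, you can count the lines that end in a carriage return. A small sketch, using a made-up sample file in place of blastx_consol_tab.txt:

```shell
cr=$(printf '\r')

# Hypothetical sample: one LF-terminated line, one CR/LF-terminated line.
sample=$(mktemp)
printf 'unix line\nwindows line\r\n' > "$sample"

# The pattern is a literal CR followed by the end-of-line anchor.
# grep exits 1 when nothing matches, hence the "|| true".
crlf_lines=$(grep -c "${cr}\$" "$sample" || true)
echo "lines ending in CR/LF: $crlf_lines"

rm -f "$sample"
```

A count of 0 on the real input would mean the carriage return is coming from somewhere else.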
If you need the output being sent to Windows to have a CR/LF line terminator instead of just an LF terminator, change the \n at the end of the string passed to echo -e to \r, so that each line ends in a CR followed by the newline that echo adds itself.
(The echo should already be adding an NL as a line terminator, but we can't tell for sure because you used head -1 to limit the output we see to just the first line.)
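One way to produce CR/LF-terminated records, sketched with printf rather than the echo -e used in the script. The records and the lookup-result text are made up for illustration, and the check at the end just confirms that every line ends in a CR before its newline.

```shell
cr=$(printf '\r')
out=$(mktemp)

# Two made-up records, each written with an explicit \r\n terminator.
for rec in "a b c" "d e f"
do
    printf '%s\t%s\r\n' "$rec" "lookup-result"
done > "$out"

total=$(wc -l < "$out" | tr -d ' ')
crlf=$(grep -c "${cr}\$" "$out" || true)
echo "lines=$total lines_ending_in_CRLF=$crlf"

rm -f "$out"
```

With printf the terminator is spelled out in the format string, so there is no question about whether an extra newline is being appended.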
Last edited by Don Cragun; 02-17-2014 at 03:49 PM..
Reason: Fix typo.