I have ~100 text files in a directory that I am trying to parse and write to a new file. I am looking for the column titles chr, start, stop, ref, alt, which should appear somewhere in each of the files. The first two fields of each new set of rows should also be printed. Since this is on Windows, I used "path\to\folder" in the bash loop.
Example of files to search (each is a separate file):
Code:
name1 1111 chr start stop ref alt comment
1 10 25 a t snp
1 20 75 t - del
2 30 120 - a ins
10 10 80 a g snp
name2 222 id chr start stop ref alt comment
1111 1 10 25 a g snp
name3 333333 id symbol chr start stop ref alt comment
222 name 1 20 75 c - del
222 name 2 30 120 - t ins
desired output
Code:
name1 1111 chr start stop ref alt
1 10 25 a t
1 20 75 t -
2 30 120 - a
10 10 80 a g
name2 222 chr start stop ref alt
1 10 25 a g
name3 333333 chr start stop ref alt
1 20 75 c -
2 30 120 - t
Thank you.
bash tried
Code:
for f in "C:\Users/test\Desktop\file\folder*.txt" ; do
bname=${f##*/}
pref=${bname%%.bam}
awk "/chr/{found=1}/start/{if(found)/stop/{if(found)/ref/{if(found)/alt/{if(found) $f print > ${pref}_edit.txt
done
The inputs are Excel xlsx files that I converted to text with VBA, so they should all be tab-separated. There are 133 individual text files, and the column titles are not in the same positions in every file. In some it will be chr,start,stop,ref,alt, in others id,chr,start,stop,ref,alt, and in others name,symbol,id,chr,start,stop,ref,alt. Does this help? Thank you.
Having three different kinds of column layout doesn't really help.
You know, you don't have to use awk; you could use regular shell scripting.
If that is easier for you, that is.
That said (and the same goes for me), here is something to get you started:
Code:
for f in *.dat ; do
bname=${f##*/}
#pref=${bname%%.bam} ## dont have that
#awk "/chr/{found=1}/start/{if(found)/stop/{if(found)/ref/{if(found)/alt/{if(found) $f print > ${pref}_edit.txt"
while read content_line
do
if echo "$content_line" | grep -q ^name
then
MODE="default" # Reset parse mode
echo "$content_line" | grep -v symbol | grep -q id && MODE=id
echo "$content_line" | grep -q symbol && MODE=symbol
fi
case $MODE in
default) while read chr start stop ref alt comment;do
line_print="$chr $start $stop $ref $alt" # comment column is read but not printed
done<<<"$content_line" ##>> ccmcbabe.output
;;
id) echo "id handling" ;;
symbol) echo "symbol handling" ;;
esac
echo "$MODE :: $line_print"
done < "$f"
done
hth
EDIT:
Which then outputs as:
Code:
sh ccmbade.sh
default :: name1 1111 chr start stop
default :: 1 10 25 a t
default :: 1 20 75 t -
default :: 2 30 120 - a
default :: 10 10 80 a g
id handling
id :: 10 10 80 a g
id handling
id :: 10 10 80 a g
symbol handling
symbol :: 10 10 80 a g
symbol handling
symbol :: 10 10 80 a g
symbol handling
symbol :: 10 10 80 a g
0 ~/tmp $
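The `id` and `symbol` branches in the loop above are left as stubs. As one possible way to finish the idea, here is a minimal sketch that avoids separate modes: it finds the `chr` title in each header row and uses its position as an offset into the data rows. It assumes whitespace-separated fields with no empty values, that chr, start, stop, ref, alt always sit next to each other in that order (as in the samples), and that each header row carries two extra leading fields. The function name `print_wanted` is made up for illustration.

```shell
# print_wanted: hypothetical helper, reads rows on stdin.
# Prints the first two header fields plus the chr..alt columns.
print_wanted() {
    offset=-1
    set -f                          # no filename expansion while word-splitting
    while IFS= read -r line; do
        set -- $line                # split the row into positional parameters
        [ "$#" -gt 0 ] || continue  # skip empty lines
        name=$1 id=$2 pos=0 hdr=-1
        for field in "$@"; do
            # Remember the 0-based position of the "chr" title, if any.
            [ "$field" = chr ] && [ "$hdr" -lt 0 ] && hdr=$pos
            pos=$((pos + 1))
        done
        if [ "$hdr" -ge 0 ]; then
            # Header row: data rows lack the two leading fields.
            offset=$((hdr - 2))
            echo "$name $id chr start stop ref alt"
        elif [ "$offset" -ge 0 ]; then
            # Data row: drop the columns in front of chr, print the next five.
            shift "$offset"
            echo "$1 $2 $3 $4 $5"
        fi
    done
    set +f
}
```

It could be used inside the existing loop as `print_wanted < "$f"`. Because it only uses positional parameters, it should also run in a plain POSIX `sh`.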
cd path/to/folder
awk -f /path2/to/script *.txt > file.out
If there are too many files for a single command line, you could instead:
Code:
for i in *.txt
do
cat "$i"
done |
awk -f /path2/to/script > file.out
--
Output with sample:
Code:
name1 1111 chr start stop ref alt
1 10 25 a t
1 20 75 t -
2 30 120 - a
10 10 80 a g
name2 222 chr start stop ref alt
1 10 25 a g
name3 333333 chr start stop ref alt
1 20 75 c -
2 30 120 - t
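The script passed to `awk -f` is not shown here. As a rough sketch of what such a script could look like (not necessarily the one that was posted), the following looks up each wanted title by name in the header row, so it does not depend on how many columns come before chr. It assumes whitespace-separated fields with no empty values, that every header row contains all five titles, and that header rows carry two extra leading fields compared with data rows. The wrapper name `extract_cols` is hypothetical.

```shell
# extract_cols: hypothetical wrapper around an awk sketch.
# Reads the named files, or stdin if none are given.
extract_cols() {
    awk '
    BEGIN { n = split("chr start stop ref alt", want, " ") }
    {
        # A header row has a field that is exactly "chr".
        hdr = 0
        for (i = 3; i <= NF; i++) if ($i == "chr") hdr = 1
    }
    hdr {
        # Map each wanted title to its data column (header position minus 2).
        for (k = 1; k <= n; k++) col[k] = 0
        for (i = 3; i <= NF; i++)
            for (k = 1; k <= n; k++)
                if ($i == want[k]) col[k] = i - 2
        print $1, $2, "chr", "start", "stop", "ref", "alt"
        next
    }
    col[1] {
        # Data row: print the five mapped columns in the wanted order.
        out = $(col[1])
        for (k = 2; k <= n; k++) out = out " " $(col[k])
        print out
    }
    ' "$@"
}

# Demo with the second sample file from the question.
printf '%s\n' 'name2 222 id chr start stop ref alt comment' \
              '1111 1 10 25 a g snp' | extract_cols
```

With real files this would be called as `extract_cols *.txt > file.out`. If the tab-separated files can contain empty cells or values with spaces, adding `-F'\t'` to the `awk` call would be the safer choice.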