Help required on Length based lookup


 
Thread Tools Search this Thread
Top Forums Shell Programming and Scripting Help required on Length based lookup
# 8  
Old 12-19-2014
This is as far as I can get. The output covers most of what you require. Please make sure your next request is way more precise. Try
Code:
awk     'FNR==NR        {A[$1]=length($1);next}
                        {B[$1]=length($1)}
         END            {print "match"
                         for (b in B) {
                           for (a in A) if (b~"^"a) {print b; delete B[b]; DEL[a]++;break}
                          }
                         for (d in DEL) delete A[d]
                         print "a"
                         for (a in A) print a
                         print "b"
                         for (b in B) print b
                        }
        ' abc bcd
match
1224
120
122
123
12345
1201
1203
a
1245
123456
191
1890
b
121
199

# 9  
Old 12-19-2014
Just for comparison...
This produces the same output as the awk solutions except 123456 is gone. Can't get rid of 121 though.

Code:
while read x
do
    grep ${x:0:3} bcd >> match
    if [[ ! $(grep ${x:0:3} bcd) ]]; then echo $x >> nomatch; fi
done < abc
sort -u match
echo
sort -u nomatch
echo

while read z
do
    if [[ ! $(grep ${z:0:3} abc) ]]; then echo $z >> noabc; fi
done < bcd
sort -u noabc

# ------------
# 120
# 1201
# 1203
# 122
# 1224
# 123
# 12345
# 
# 1245
# 1890
# 191
# 
# 121
# 199

Login or Register to Ask a Question

Previous Thread | Next Thread

10 More Discussions You Might Find Interesting

1. Shell Programming and Scripting

Outputting sequences based on length with sed

I have this file: >ID1 AA >ID2 TTTTTT >ID-3 AAAAAAAAA >ID4 TTTTTTGGAGATCAGTAGCAGATGACAG-GGGGG-TGCACCCC Add I am trying to use this script to output sequences longer than 15 characters: sed -r '/^>/N;{/^.{,15}$/d}' The desire output would be this: >ID4... (8 Replies)
Discussion started by: Xterra
8 Replies

2. Shell Programming and Scripting

Append 0's based on length

I'm having data like this, "8955719","186497034","0001","M","3" "8955719","186497034","0002","M","10" "8955719","186497034","0003","M","10" "8955719","186497034","0004","M","3" "8955723","186499034","0001","M","3" "8955723","186499034","0002","M","10" "8955723","186499034","0003","M","10"... (3 Replies)
Discussion started by: Artlk
3 Replies

3. Shell Programming and Scripting

Filtering duplicates based on lookup table and rules

please help solving the following. I have access to redhat linux cluster having 32gigs of ram. I have duplicate ids for variable names, in the file 1,2 are duplicates;3,4 and 5 are duplicates;6 and 7 are duplicates. My objective is to use only the first occurrence of these duplicates. Lookup... (4 Replies)
Discussion started by: ritakadm
4 Replies

4. Shell Programming and Scripting

Append spaces the rows to make it into a required fixed length file

I want to make a script to read row by row and find its length. If the length is less than my required length then i hav to append spaces to that paritucular row. Each row contains special characters, spaces, etc. For example my file contains , 12345 abcdef 234 abcde 89012 abcdefgh ... (10 Replies)
Discussion started by: Amrutha24
10 Replies

5. UNIX for Dummies Questions & Answers

Length of a segment based on coordinates

Hi, I would like to have the length of a segment based on coordinates of its parts. Example input file: chr11 genes_good3.gtf aggregate_gene 1 100 gene1 chr11 genes_good3.gtf exonic_part 1 60 chr11 genes_good3.gtf exonic_part 70 100 chr11 genes_good3.gtf aggregate_gene 200 1000 gene2... (2 Replies)
Discussion started by: fadista
2 Replies

6. UNIX for Dummies Questions & Answers

Sorting words based on length

i need to write a bash script that recive a list of varuables kaka pele ronaldo beckham zidane messi rivaldo gerrard platini i need the program to print the longest word of the list. word in the output appears on a separate line and word order in the output is in the order Llachsicografi costs.... (1 Reply)
Discussion started by: yairpg
1 Replies

7. Shell Programming and Scripting

Split strings based on length

Hi All I am very much in need of help splitting strings based on length in Perl. e.g., Input text is : International NOUN Corp. NOUN 's POS Tulsa NOUN Output I want is : International I In Int Inte l al nal onal NOUN Corp. C Co Cor Corp . p. rp. orp. NOUN... (2 Replies)
Discussion started by: my_Perl
2 Replies

8. Shell Programming and Scripting

SED based on file lookup

Newb here trying to figure this one out. :confused: I am trying to create a SED (or some other idea) line that will replace the data field if the original text is seen in a separate text file. The lookup file would be line delimted. For example: sed 's/<if in file>/YES/' File structure:... (3 Replies)
Discussion started by: sdlennon
3 Replies

9. UNIX for Advanced & Expert Users

Clueless about how to lookup and reverse lookup IP addresses under a file!!.pls help

Write a quick shell snippet to find all of the IPV4 IP addresses in any and all of the files under /var/lib/output/*, ignoring whatever else may be in those files. Perform a reverse lookup on each, and format the output neatly, like "IP=192.168.0.1, ... (0 Replies)
Discussion started by: choco4202002
0 Replies

10. UNIX for Dummies Questions & Answers

Need find a file based length

Can some please help me? Want to find files over 35 characters in length? I am running HPUX. Would it be possible with find? Thanks in advance (8 Replies)
Discussion started by: J_ang
8 Replies
Login or Register to Ask a Question