cmd sequence to find & cut out a specific string


 
# 1  
10-20-2008

A developer of mine has this requirement. I couldn't tell her quickly how to do it with UNIX commands or a short script, so she's writing a small program to do it, but that got my curiosity up and I thought I'd ask here for advice.

In a text file, about half of the records contain a specific string, say "ABC", followed by a 15-digit number that always has at least 2 leading zeros. In rows that have this, it appears twice, identically.
I essentially want to cut these 18 characters out into a file of their own. However, they are not in a fixed column position within the file.

Logically, the task is:
a) find the rows with ABC00
b) get the position of that first A
c) cut starting at that position for 18 characters and write to a new file.

example data:
ab cdefgABC000000000012345ABC000000000012345sadlfk
abcde fgABC000000000012346ABC000000000012346sadlfk
abc defgghi jklmn1349d5sadlfk
abcdef sldkfdgABC000000000056789ABC000000000056789abcdlkdfj134239d


and so on.

Desired output (18 characters per line: "ABC" plus the 15 digits):
ABC000000000012345
ABC000000000012346
ABC000000000056789

Thanks for having a look.
Lisa
# 2  
10-20-2008
One approach

Lisa,
There are probably many ways to do this, but here is one approach:

Code:
> sed "s/ABC[0-9][0-9]/~+&/" file220 | tr "~" "\n" | grep "+" | cut -c2-19
ABC000000000012345
ABC000000000012346
ABC000000000056789
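
A variation on the same idea, offered only as a sketch: a single sed call can keep just the token and print only the rows where it was found. The stricter pattern `ABC00` plus 13 digits follows the original question rather than the command above, and the file names `sample.txt` and `tokens.txt` (inside a temporary directory) are made up for the example.

```shell
# Hypothetical scratch directory and file names, used only for this example.
workdir=$(mktemp -d)

# Sample rows copied from the question.
cat > "$workdir/sample.txt" <<'EOF'
ab cdefgABC000000000012345ABC000000000012345sadlfk
abcde fgABC000000000012346ABC000000000012346sadlfk
abc defgghi jklmn1349d5sadlfk
abcdef sldkfdgABC000000000056789ABC000000000056789abcdlkdfj134239d
EOF

# -n suppresses the default output; the trailing "p" prints only lines where
# the substitution succeeded, so rows without the token are dropped.
# The leading ".*" is greedy, so the LAST occurrence on the line is captured;
# that is fine here because both occurrences are stated to be identical.
sed -n 's/.*\(ABC00[0-9]\{13\}\).*/\1/p' "$workdir/sample.txt" > "$workdir/tokens.txt"

cat "$workdir/tokens.txt"
```

Unlike the marker-based pipeline, this version does not depend on "~" or "+" being absent from the data.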

# 3  
10-20-2008
Wow! That's slick, and it worked on my data stream, so 1000s of thanks. Now the ethical dilemma: do I just give it to the developer as if I did it, or do I 'fess up that I asked for help?

Lisa
# 4  
10-20-2008
Under the assumption that no programming is ever truly unique and created...

You found a solution and verified it works.

Most every problem has already been pondered and solved, so there truly are no "new" answers. Ha ha

Back to the initial problem: using sed to insert marker characters in front of the match, tr to turn one of those markers into a newline, and then grep and cut to pick out the marked text is one useful way to pull apart data records.
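
To make the four stages concrete, here is the pipeline from post #2 run one step at a time on the first sample row. This is only an illustration; the variable names are invented for it, and it assumes, as the original command does, that neither "~" nor "+" occurs anywhere else in the data.

```shell
# First sample row from the question.
row='ab cdefgABC000000000012345ABC000000000012345sadlfk'

# Stage 1: sed inserts the two markers "~+" in front of the first match.
stage1=$(printf '%s\n' "$row" | sed 's/ABC[0-9][0-9]/~+&/')
printf '%s\n' "$stage1"
# ab cdefg~+ABC000000000012345ABC000000000012345sadlfk

# Stage 2: tr converts "~" to a newline, so the match now begins a line of
# its own, right after the "+" marker.
stage2=$(printf '%s\n' "$stage1" | tr '~' '\n')
printf '%s\n' "$stage2"
# ab cdefg
# +ABC000000000012345ABC000000000012345sadlfk

# Stage 3: grep keeps only the lines that carry the "+" marker.
stage3=$(printf '%s\n' "$stage2" | grep '+')

# Stage 4: cut skips the "+" in column 1 and keeps columns 2-19 (18 characters).
final=$(printf '%s\n' "$stage3" | cut -c2-19)
printf '%s\n' "$final"
# ABC000000000012345
```

If the data could contain "~" or "+", the markers would need to be swapped for characters that are guaranteed not to appear.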

Let her think you were the genius.
# 5  
10-21-2008
Inevitably, a perl approach

Code:
perl -ne '/(ABC00\d{13})/ && print "$1\n"' list.txt
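
For comparison, the three steps listed in the original question (find the row, get the position of the first A, cut 18 characters) can be written almost literally with awk's `index()` and `substr()`. This is a rough sketch rather than a drop-in replacement: unlike the perl regex it does not check that the characters after `ABC00` are digits, and the temporary directory and file names are hypothetical.

```shell
# Hypothetical scratch directory and file names, used only for this example.
workdir=$(mktemp -d)

# Sample rows copied from the question.
cat > "$workdir/sample.txt" <<'EOF'
ab cdefgABC000000000012345ABC000000000012345sadlfk
abcde fgABC000000000012346ABC000000000012346sadlfk
abc defgghi jklmn1349d5sadlfk
abcdef sldkfdgABC000000000056789ABC000000000056789abcdlkdfj134239d
EOF

awk '{
    pos = index($0, "ABC00")        # b) position of the first "A"; 0 if ABC00 is absent
    if (pos > 0)                    # a) keep only rows that contain ABC00
        print substr($0, pos, 18)   # c) the 18 characters starting at that position
}' "$workdir/sample.txt" > "$workdir/tokens.txt"

cat "$workdir/tokens.txt"
```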

 