Match partial text


 
# 1  
Old 03-07-2015

I posted the incorrect files yesterday and apologize. I also modified the awk script, but with no luck. There are two text files in the zip (name.txt and output.txt). I am trying to match $2 in name.txt with $1 in output.txt, and if they match, $1 of name.txt is copied to $7 of output.txt. The tricky part (well, at least for me) is that $2 and $1 match only partially. Thank you :).

Code:
awk 'NR==FNR{A[$1]=$2; next}  A[$2]  {$2=$2 " " A[$4]}1' output.txt name.txt > output.txt


HTML Code:
 Desired output.txt 
DTE3504500000001ref	34529	35031	1	DTE3504500000001	SeqRxn4  1
DTE3504500000001antiref	35031	34529	1
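
A side note on the command in this post: it reads output.txt and also redirects into output.txt. The shell truncates the redirection target before awk opens it, so awk sees an empty file. A minimal sketch of the usual workaround follows, writing to a temporary file and renaming it afterwards. The file demo.txt and its contents are illustrative stand-ins, not the attached files.

```shell
# Illustrative only: demo.txt stands in for output.txt.
workdir=$(mktemp -d)
printf 'a\tb\n' > "$workdir/demo.txt"

# Unsafe: the shell empties demo.txt before awk opens it, so awk reads nothing.
#   awk '{print $0, "x"}' demo.txt > demo.txt

# Safer: write to a temporary file, then replace the original on success.
awk 'BEGIN { FS = OFS = "\t" } { print $0, "x" }' "$workdir/demo.txt" > "$workdir/demo.tmp" \
  && mv "$workdir/demo.tmp" "$workdir/demo.txt"

result=$(cat "$workdir/demo.txt")
printf '%s\n' "$result"
```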
# 2  
Old 03-07-2015
Both files are in DOS format, not in Unix format.

Could you describe what should be matched? Apparently DTE3504500000001ref should match DTE3504500000001, but DTE3504500000001antiref should not. What is the criterion?
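
On the DOS-format point: with CR LF line endings, the last field of every line carries a trailing carriage return, so a key such as DTE3504500000001 from name.txt would not compare equal to the same text elsewhere. A minimal sketch of stripping the carriage returns with awk follows. The file names name_dos.txt and name_unix.txt are hypothetical, and the sample row is only modeled on the thread.

```shell
workdir=$(mktemp -d)

# Build a small DOS-format sample (each line ends in CR LF).
printf '1\tDTE3504500000001\r\n' > "$workdir/name_dos.txt"

# Strip the trailing carriage return from every line, then print the line.
awk '{ sub(/\r$/, "") } 1' "$workdir/name_dos.txt" > "$workdir/name_unix.txt"

# Count the carriage returns left in the converted file.
cr_count=$(tr -cd '\r' < "$workdir/name_unix.txt" | wc -c | tr -d ' ')
echo "$cr_count"
```

Tools such as dos2unix do the same job where they are installed.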
# 3  
Old 03-07-2015
Sorry, the correct files are attached. DTE3504500000001 is the criterion to match, so that both records will be assigned the same value. Also, the final output.txt needs to be delimited so it can be opened in Excel. I am not sure where to put the OFS setting:
Code:
OFS="\t"

Thank you :).
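
On where OFS can go: two common placements are a BEGIN block inside the script and a -v assignment on the command line. In both cases awk only applies OFS when it rebuilds the record (for example after a field assignment such as $1 = $1) or when print is given comma-separated arguments. A minimal sketch with one illustrative, space-separated input line:

```shell
# Illustrative input: one space-separated line.
line='DTE3504500000001ref 34529 35031 1'

# Option 1: set OFS in a BEGIN block; $1 = $1 forces awk to rebuild $0 with OFS.
out1=$(printf '%s\n' "$line" | awk 'BEGIN { OFS = "\t" } { $1 = $1; print }')

# Option 2: set OFS with -v on the command line; same effect.
out2=$(printf '%s\n' "$line" | awk -v OFS='\t' '{ $1 = $1; print }')

printf '%s\n%s\n' "$out1" "$out2"
```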
# 4  
Old 03-07-2015
Try
Code:
awk 'NF == 2 {T[$2]=$1; next} {print $0, T[$5]}' FS="\t" OFS="\t" /tmp/name.txt /tmp/output.txt

Reading ALL the data from name.txt into memory might take a while if the file is large...
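
For readers following along, here is a sketch of the same two-file lookup on sample rows modeled on the thread (the real attachments are not reproduced here). It differs from the one-liner above in one respect, which is an assumption about what "partial match" should mean: the key is derived from $1 by dropping a trailing ref or antiref, so both records receive the value from name.txt. The array name chrom and the temporary directory are hypothetical.

```shell
workdir=$(mktemp -d)

# Sample data modeled on the thread; tab-separated, Unix line endings.
printf '1\tDTE3504500000001\n' > "$workdir/name.txt"
printf 'DTE3504500000001ref\t34529\t35031\t1\tDTE3504500000001\tSeqRxn4\n' > "$workdir/output.txt"
printf 'DTE3504500000001antiref\t35031\t34529\t1\n' >> "$workdir/output.txt"

combined=$(awk '
  BEGIN { FS = OFS = "\t" }
  NR == FNR { chrom[$2] = $1; next }   # first file: map ID to its value
  {
    key = $1
    sub(/(anti)?ref$/, "", key)        # assumption: drop the ref/antiref suffix
    print $0, chrom[key]               # append the looked-up value as a new field
  }' "$workdir/name.txt" "$workdir/output.txt")

printf '%s\n' "$combined"
```

Writing the result to a new file (for example combined.txt) rather than back to output.txt keeps the input intact.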
# 5  
Old 03-07-2015
Quote:
Originally Posted by cmccabe
Sorry, the correct files are attached. DTE3504500000001 is the criterion to match, so that both records will be assigned the same value. Also, the final output.txt needs to be delimited so it can be opened in Excel. I am not sure where to put the OFS setting:
Code:
OFS="\t"

Thank you :).
Then in your example, why does only the record with DTE3504500000001ref get an extra 1 at the end and why doesn't the one with DTE3504500000001antiref get one?

Code:
DTE3504500000001ref	34529	35031	1	DTE3504500000001	SeqRxn4  1
DTE3504500000001antiref	35031	34529	1

# 6  
Old 03-09-2015
Thank you very much. I attached combined.txt but forgot that $5 needs to be copied to $8 and the blank row between the two lines removed. combined.txt really only needs to look like the below, but I'm not sure how to do this. Basically, I am going to import the sheet into a SQL database and am trying to format the data accordingly: the values in the sheet are combined (.....ref and .....antiref) and the chromosome is matched. Thank you very much :).

Code:
awk 'NF == 2 {T[$2]=$1; next} {print $0, T[$5]}' FS="\t" OFS="\t" name.txt output.txt > combined.txt

HTML Code:
1     DTE3504500000001
1     DTE3504500000002 
1     DTE3504500000003
# 7  
Old 03-09-2015
You lost me. Does it work? Doesn't it? What's missing?