awk regexp to print repetitive pattern
Post 302977321 by yifangt on Friday 15th of July 2016 07:29:17 PM
Thanks Don and all!
What I was doing was combining two files (each half a million lines) on a matching column. Where there is no match (0.01%), I fill in the missing fields with "-", which prompted me to ask whether there is a regex for my purpose:
Code:
$ cat file2
XLOC_000001 TCONS_00000001 LOC_Os02g39790.2 47.273 55 24 2 3855 4016 335 385 3.60e-04  
XLOC_000001 TCONS_00000002 LOC_Os02g39790.2 47.273 55 24 2 2368 2529 335 385 2.35e-04  
XLOC_000001 TCONS_00000007 LOC_Os02g39790.2 47.273 55 24 2 3553 3714 335 385 3.09e-04  
XLOC_000001 TCONS_00000009 LOC_Os02g39790.2 47.273 55 24 2 5083 5244 335 385 2.83e-04  
XLOC_000001 TCONS_00000011 LOC_Os02g39790.2 47.273 55 24 2 2200 2361 335 385 2.28e-04   
$ cat file1
TCONS_00000001    W5GKA3_WHEAT
TCONS_00000002    W5GKA3_WHEAT
TCONS_00000011    I1IBH3_BRADI
TCONS_00000009    W5GKA3_WHEAT
TCONS_00000005    I1IBH3_BRADI
TCONS_00000006    I1IBH3_BRADI
TCONS_00000007    W5GKA3_WHEAT

Code:
$ awk 'NR==FNR {A[$2]=$0; next}; {if (A[$1]) print $0, A[$1]; else print $0, "-\t-\t-\t-\t-\t-\t-\t-\t-\t-\t-\t-" }' file2 file1
TCONS_00000001    W5GKA3_WHEAT XLOC_000001 TCONS_00000001 LOC_Os02g39790.2 47.273 55 24 2 3855 4016 335 385 3.60e-04  
TCONS_00000002    W5GKA3_WHEAT XLOC_000001 TCONS_00000002 LOC_Os02g39790.2 47.273 55 24 2 2368 2529 335 385 2.35e-04  
TCONS_00000011    I1IBH3_BRADI XLOC_000001 TCONS_00000011 LOC_Os02g39790.2 47.273 55 24 2 2200 2361 335 385 2.28e-04   
TCONS_00000009    W5GKA3_WHEAT XLOC_000001 TCONS_00000009 LOC_Os02g39790.2 47.273 55 24 2 5083 5244 335 385 2.83e-04  
TCONS_00000005    I1IBH3_BRADI -    -    -    -    -    -    -    -    -    -    -    -
TCONS_00000006    I1IBH3_BRADI -    -    -    -    -    -    -    -    -    -    -    -
TCONS_00000007    W5GKA3_WHEAT XLOC_000001 TCONS_00000007 LOC_Os02g39790.2 47.273 55 24 2 3553 3714 335 385 3.09e-04

Typing that long string of "-\t" repeatedly is tedious, and I got dizzy counting the tabs as I typed. I made mistakes and needed to re-run it a couple of times to get it right! So I wondered whether a regex could help me avoid such mistakes.
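A regex matches text rather than generating it, so one option is to build the placeholder with a short loop instead. The following is only a minimal sketch under assumptions: the names `n` and `pad`, the scratch directory, and the two-line sample files are made up for illustration, and `n=12` is taken from the field count of the file2 rows shown above. It also swaps the `if (A[$1])` test for `$1 in A`, which checks membership without creating empty array entries.

```shell
# Scratch directory with tiny stand-ins for the real inputs (illustrative data only).
tmp=$(mktemp -d)
printf '%s\n' \
  'XLOC_000001 TCONS_00000001 LOC_Os02g39790.2 47.273 55 24 2 3855 4016 335 385 3.60e-04' \
  > "$tmp/file2"
printf 'TCONS_00000001    W5GKA3_WHEAT\nTCONS_00000005    I1IBH3_BRADI\n' > "$tmp/file1"

# n is the number of placeholder fields (hypothetical variable name).
joined=$(awk -v n=12 '
    BEGIN {
        # Build the placeholder once: n dashes joined by tabs.
        pad = "-"
        for (i = 2; i <= n; i++) pad = pad "\t-"
    }
    # First file (file2): index each row by its second column.
    NR == FNR { A[$2] = $0; next }
    # Second file (file1): append the matching row, or the placeholder.
    { print $0, (($1 in A) ? A[$1] : pad) }
' "$tmp/file2" "$tmp/file1")

printf '%s\n' "$joined"
rm -rf "$tmp"
```

With this layout, `-v n=12` should be the only thing to change if file2 gains or loses columns.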
Thanks again!

Last edited by yifangt; 07-15-2016 at 08:32 PM. Reason: typos
 

Unix & Linux Forums Content Copyright 1993-2022. All Rights Reserved.