String pattern matching and position (UNIX for Dummies Questions & Answers), post 302915033 by Don Cragun, Saturday 30th of August 2014, 05:47:16 PM
The awk utility is also only defined to work when the input files it reads are text files. So, although some versions of awk can handle long lines, incomplete lines, or both, many cannot. If you would like something that should work on any UNIX or Linux system, you could try something like this:
Code:
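awk '
# Print a running match count and the position of the match.
function p(spot) {
	printf("%10d %10d\n", ++cnt, spot)
}
# A match split across two lines: the previous line ended in T
# (at position te) and this line starts with R.
te && /^[Rr]/ {
	p(te)
}
# Report every TR (any case) in this line. Overwriting the T with a space
# keeps the line length unchanged and stops match() from finding it again.
{	while(match($0, /[Tt][Rr]/)) {
		p(off + RSTART)
		$0 = substr($0, 1, RSTART - 1) " " substr($0, RSTART + 1)
	}
}
# off is the number of characters read so far, not counting <newline>s.
# Remember the position of a trailing T for the next line.
{	off += length($0)
	if($0 ~ /[Tt]$/) {
		te = off
	} else	te = 0
}' input.txt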

Note that this works on your input file before the <newline> characters are stripped out. Instead of needing about 200 MB of memory to read your one-line file, awk only has to hold one line of about 50 characters at a time.
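
Whether a given file is at risk can be checked roughly before handing it to awk. As far as line structure goes, a text file has no line longer than LINE_MAX bytes (counting the <newline>) and does not end with an incomplete line. The function below is only a sketch and not part of the original post; the name check_text_file and its messages are made up for illustration.

```shell
# Hypothetical helper (not from the post): rough text-file check.
check_text_file() {
	# Longest line the standard utilities must accept, in bytes,
	# including the <newline>. 2048 is the POSIX minimum, used here
	# as a fallback if getconf is unavailable.
	max=$(getconf LINE_MAX 2>/dev/null || echo 2048)
	# Longest line in the file, not counting the <newline>.
	longest=$(awk '{ if (length($0) > m) m = length($0) } END { print m + 0 }' "$1")
	# Command substitution drops a trailing <newline>, so an empty
	# result means the file is empty or ends with a <newline>.
	last=$(tail -c 1 "$1")
	if [ "$longest" -lt "$max" ] && [ -z "$last" ]; then
		echo "looks like a text file"
	else
		echo "long or incomplete lines"
	fi
}
```

The check uses awk itself to measure the lines, so an awk that cannot cope with long lines may fail here too. Also, length() counts characters while LINE_MAX is in bytes, so the two only agree for single-byte data such as the letters in this thread.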

With the following randomly generated list of upper- and lower-case letters (the first 8 and last 8 characters in the file are not random):
Code:
TrTRtrtRmzGArXRqWdKOmxzDWLKZVnPRRrAVNcpAflTxvLkLbs
NbZdBuopHQnEqVJiLWYHVZUfHLqUTmRPesoqVbVdgXXglCCEQC
ZRfvLdXyfgpufseFnIIboRbtDXtlttNQudyeOGyLvLGzSOPyMo
VpxGVwNJKXpYUlhZuNgIcgYuscJRzmExrJZWeeRgnHXwxkxbKh
mndPLikztEWtlovWaOddGCSEijRtrkgWWzvQADIQhsfVEAwmXQ
eIImjmJnvLTQLubbchEwLclnjVmUKuIRxmUOSmarnWYyEBKQpX
gEpdrIXIXiUsiMjQQWWIYWYCfSBwMsPQwvLHyGRwKldfvOxzar
xgwKodWiJxgAhVhlCfalWRpijwiHRlYntBOxweZrvwPPLTYpmN
REPdLIcZnBLWORUkpLCBtlTzjOmQBDVuFEAYfzLTIbyZaNVUMt
rfDzbKDxzXoCqnpWntyTrkyIrSrZTopjapZFouHDGxmlZmxswW
AcvPaJKxLSXZLCLfRZVuxusjYcKzlpZajBMvweadarCAIGjPdM
yiFAqrMDySoxpPREnFPHDQaFJDVUDsYXmbZGkhbvImOkCKfAsg
kauwlSzzrbqrBrXCLJsHXlHAdoRBjXjQbUoFJslyENNKnjIADT
RMEZvOSLWqHeeEoIUddxBxdHuuEMqTpYVTIoGUNVPxKPcSadji
ecsIoISmpwIPIqCXYdwqsvbtTKuoQflREDkZPLxtlyfOVeuKxj
LkwARhocaWFEMjZlPHtuCiYmxfqtYSGwlRSLZHzYGDZoHzvJbm
GsXLsRcuvLEQcXPRakbdeHGLrrnZgwyMFHmXNMmNNbEnfkXumM
pUSpOhpTakWOpQNohhjcuObfSfteNBMyJivKQKhPJQtrtRTrTR

it produces the output:
Code:
         1          1
         2          3
         3          5
         4          7
         5        228
         6        450
         7        470
         8        650
         9        893
        10        895
        11        897
        12        899

giving you the number of matches found and their positions in the file (not counting <newline> characters).
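
One way to convince yourself that the positions ignore <newline> characters, including matches that straddle two lines, is to compare against a brute-force scan of the same data with the <newline>s removed. This is only a sketch for small inputs, since the joined data has to fit in a single awk line. The sample data, the temporary file name, and the function names by_line and joined are made up for illustration; by_line uses the same logic as the script above but prints only the position.

```shell
# Positions of TR (any case), found one line at a time.
by_line() {
	awk '
	function p(spot) { printf("%d\n", spot) }
	te && /^[Rr]/ { p(te) }
	{	while (match($0, /[Tt][Rr]/)) {
			p(off + RSTART)
			$0 = substr($0, 1, RSTART - 1) " " substr($0, RSTART + 1)
		}
	}
	{	off += length($0)
		te = ($0 ~ /[Tt]$/) ? off : 0
	}' "$1"
}

# Brute force: join the lines, then test every pair of adjacent characters.
# The echo puts back one final <newline> so awk still reads a complete line.
joined() {
	{ tr -d '\n' < "$1"; echo; } | awk '{
		s = tolower($0)
		for (i = 1; i < length(s); i++)
			if (substr(s, i, 2) == "tr") print i
	}'
}

# Small sample in which two of the matches are split across lines.
sample=${TMPDIR:-/tmp}/tr_sample.$$
printf 'xTRat\nrbcT\nRtr\n' > "$sample"
by_line "$sample"
joined "$sample"
rm -f "$sample"
```

For this sample both functions report positions 2, 5, 9 and 11; the matches at 5 and 9 are the ones that cross a line boundary.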