Search & Replace regex Perl one liner to AWK one liner


 
# 1  
Old 07-05-2011

Thanks for taking the time to answer questions and help newbies like me understand awk.

I have a huge file (millions of lines), and Perl takes quite a while on it, so I'd like to convert these Perl one-liners to awk.

Basically, I'd like every ISA that is sandwiched between non-word characters to start on its own line.

Then I'd like to remove the non-word character in front of each "sandwiched" ISA; put another way, each "sandwiched" ISA should sit at the beginning of a line.

Code:
perl -pi -e 's/[\W_]ISA[\W_]/\n$&/g' large_file 
perl -pi -e 's/^[\W_]ISA/ISA/g' large_file

How would I do this in awk? Thanks so much for the help; I really do appreciate it. Please let me know if I can explain this more clearly or if you need sample data.

Thank you!!!!
# 2  
Old 07-05-2011
Please post a few sample lines.
# 3  
Old 07-06-2011
Hi and thanks Danmero!

Here are a few sample lines. I only want the sandwiched ISAs (highlighted in red in the original post) moved to a new line, not an ISA embedded in other text such as the one in AK5~AISATF (highlighted in purple). I know it's a bit messy; I can explain the logic/syntax of the file if you'd like.

Code:
          ISA~00~          ~00~          ~ZZ~SEND  MFG       ~ZZ~RECV MFG       ~110616~2235~U~00200~000003972~0~P~\
GS~FA~SEND  MFG~RECV MFG~20110616~2235~4075~X~004010
ST~997~00001
AK1~SH~4075
AK2~856~000008260
AK5~AISATF
AK9~A~00001~00001~00001
SE~006~00001
GE~00001~4075
IEA~00001~000003972&ISA!00!SEND DATA  !00!SEND DATA  !ZZ!SEND  PDCPO     !ZZ!RECV            !110616!1540!U!00401!000009564!0!P!:
GS!FA!SEND  PDCPO!RECV!20110616!1540!9669!X!004010
ST!997!000021081
AK1!SH!12738
AK9!A!1!1!1
SE!4!000021081
GE!1!9669
IEA~1~00000956`ISA~00~SEND DATA  ~00~SEND DATA  ~ZZ~SEND  PDCPO     ~ZZ~RECV            ~110616~1540~U~00401~000009565~0~P~:>GS~FA~SEND  PDCPO~RECV~20110616~1540~9670~X~004010>ST~997~000021082>AK1~SH~12739>AK9~A~1~1~1>SE~4~000021082>GE~1~9670>IEA~1~000009565

---------- Post updated 07-06-11 at 11:04 AM ---------- Previous update was 07-05-11 at 05:38 PM ----------

Thought I'd add some details on the file.

ISA, GS, ST, AK1, AK2, AK5, AK9, SE, GE, IEA are line headers and generally follow the same order. ISA is the beginning of the record, IEA is the end of the record. There are tens of thousands of records in a given file.

The file also has non-word-character field separators (e.g. ~ and !) and line separators (either a newline or a non-word character; a later awk script will change all [\W] to newlines).
# 4  
Old 07-06-2011
Try:
Code:
awk '{gsub("[^a-zA-Z0-9]ISA[^a-zA-Z0-9]","\n&")}1' file

and
Code:
awk '{sub("^[^a-zA-Z0-9]ISA","ISA")}1' file
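
For plain ASCII data, Perl's [\W_] matches anything that is not a letter or a digit, so the closest awk bracket expression is [^a-zA-Z0-9]. The second command also relies on the newlines added by the first, so the two have to run one after the other. A minimal sketch that chains them through a pipe instead of a temporary file; the function name isa_to_line_start and the sample input are made up for illustration:

```shell
# Hypothetical helper: chains the two awk steps through a pipe.
# Step 1 puts a newline in front of every ISA sandwiched between non-alphanumerics.
# Step 2 sees each sandwiched ISA at the start of a line and drops the separator before it.
isa_to_line_start() {
    awk '{ gsub(/[^a-zA-Z0-9]ISA[^a-zA-Z0-9]/, "\n&") } 1' "$@" |
        awk '{ sub(/^[^a-zA-Z0-9]ISA/, "ISA") } 1'
}

# Made-up sample: one sandwiched ISA and one embedded ISA (AISATF) that must stay put.
printf '%s\n' 'IEA~1~00000956`ISA~00~SEND' 'AK5~AISATF' | isa_to_line_start
```

On this sample the sandwiched ISA moves to the start of its own line, while AK5~AISATF is left untouched because its ISA sits between letters.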

# 5  
Old 07-06-2011
Thanks so much, bartus11. I really appreciate your time. I see where I went wrong initially: I didn't use quotes, and I used sub instead of gsub.

Code:
awk -mr 99999999 '{gsub("[^a-zA-Z0-9]ISA[^a-zA-Z0-9]","\n&")}1' junemthlyob >> junemthlyob.a

One of the ISA lines has 3163417 characters, and I'm getting the error below. Do you have any suggestions on how to overcome it?

Code:
awk: gsub() result
 ISA`00`FT too big
 input record number 30369, file /dev/fs/C/UNIX/SFU/USER/IBOBX12/JUNEOB2/junemthlyob
 source line number 1

# 6  
Old 07-06-2011
What system are you using?
# 7  
Old 07-06-2011
Hi bartus11, I'm using Windows Services for UNIX (Interix) with the Korn shell.
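
The "gsub() result ... too big" message suggests that this awk limits the size of the string gsub() builds. One possible workaround, offered only as a sketch and not tested on Interix, is to avoid gsub() and walk the record with match() and substr(), printing each piece as soon as it is found. It assumes the record itself can still be read (for example with the -mr option used above). The function name isa_split_stream and the sample input are made up for illustration:

```shell
# Hypothetical helper: moves sandwiched ISAs to the start of a line without gsub().
isa_split_stream() {
    awk '
    BEGIN { re = "[^a-zA-Z0-9]ISA[^a-zA-Z0-9]" }
    {
        rest = $0
        while (match(rest, re)) {
            # Print the text before the match and end the line there.
            printf "%s\n", substr(rest, 1, RSTART - 1)
            # Skip the leading separator; print ISA and the separator after it.
            printf "%s", substr(rest, RSTART + 1, RLENGTH - 1)
            # Continue scanning after the matched text.
            rest = substr(rest, RSTART + RLENGTH)
        }
        # Whatever is left closes the current output line.
        print rest
    }' "$@"
}

# Made-up sample: one sandwiched ISA and one embedded ISA (AISATF).
printf '%s\n' 'IEA~1~00000956`ISA~00~SEND' 'AK5~AISATF' | isa_split_stream
```

For sandwiched ISAs this does the work of both substitutions in one pass, so the second awk command is not needed. A line that merely starts with a separator followed by ISA and a letter or digit is left alone here, which differs slightly from the second Perl command.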