How to quickly substitute a pattern within a certain range of a huge file?
I have big files (some are >300 GB!) in which some patterns need to be substituted, for example, runs of multiple spaces changed into a tab. I used this one-liner:
but it seems very slow, as the job is still running after 24 hours! In this example, only the first 18 rows need to be changed; the rest should stay untouched.
Is there any better way to do the job quickly? I'm using GNU bash, version 4.4.12(1)-release (x86_64-pc-linux-gnu) on Linux 4.9.0-4-amd64 #1 SMP Debian 4.9.65-3+deb9u1 (2017-12-23) x86_64 GNU/Linux.
Thanks a lot!
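One possible approach, sketched under assumptions: restrict the substitution to the leading lines and copy the remainder through without running any pattern matching on it. The function name `tabify_first_lines` and the file names in the usage note are hypothetical, and "multiple spaces" is assumed to mean two or more consecutive space characters. The file still has to be read and written once, because the edited lines change length.

```shell
# Hypothetical helper: tabify_first_lines FILE N
# Replaces every run of two or more spaces with a tab in the first N lines
# of FILE and writes the result to standard output.
tabify_first_lines() {
    file=$1
    n=$2
    # Only the first N lines go through the regex substitution.
    head -n "$n" "$file" | awk '{ gsub(/  +/, "\t"); print }'
    # Everything from line N+1 onward is copied unchanged.
    tail -n "+$((n + 1))" "$file"
}
```

Usage might look like `tabify_first_lines big.txt 18 > big.fixed.txt`, after which the original can be replaced once the output has been checked.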
regex(1F) FMLI Commands regex(1F)
NAME
regex - match patterns against a string
SYNOPSIS
regex [-e] [-v "string"] [pattern template] ... pattern [template]
DESCRIPTION
The regex command takes a string from the standard input, and a list of pattern / template pairs, and runs regex() to compare the string
against each pattern until there is a match. When a match occurs, regex writes the corresponding template to the standard output and
returns TRUE. The last (or only) pattern does not need a template. If that is the pattern that matches the string, the function simply
returns TRUE. If no match is found, regex returns FALSE.
The argument pattern is a regular expression of the form described in regex(). In most cases, pattern should be enclosed in single quotes
to turn off special meanings of characters. Note that only the final pattern in the list may lack a template.
The argument template may contain the strings $m0 through $m9, which will be expanded to the part of pattern enclosed in ( ... )$0 through
( ... )$9 constructs (see examples below). Note that if you use this feature, you must be sure to enclose template in single quotes so that
FMLI does not expand $m0 through $m9 at parse time. This feature gives regex much of the power of cut(1), paste(1), and grep(1), and some
of the capabilities of sed(1). If there is no template, the default is $m0$m1$m2$m3$m4$m5$m6$m7$m8$m9.
OPTIONS
The following options are supported:
-e Evaluates the corresponding template and writes the result to the standard output.
-v "string" Uses string instead of the standard input to match against patterns.
EXAMPLES
Example 1 Cutting letters out of a string
To cut the 4th through 8th letters out of a string (this example will output strin and return TRUE):
`regex -v "my string is nice" '^.{3}(.{5})$0' '$m0'`
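Outside FMLI, a rough analogue of Example 1 can be sketched in plain shell with sed. This is not the regex(1F) command: the helper name `cut_4_to_8` is hypothetical, and `sed -E` (extended regular expressions) is assumed to be available. Unlike regex, sed prints the input unchanged when the pattern does not match.

```shell
# Hypothetical helper, not part of FMLI: print characters 4 through 8 of $1.
cut_4_to_8() {
    # Skip 3 characters, capture the next 5, and drop the rest of the line.
    printf '%s\n' "$1" | sed -E 's/^.{3}(.{5}).*$/\1/'
}

cut_4_to_8 "my string is nice"   # prints: strin
```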
Example 2 Validating input in a form
In a form, to validate input to field 5 as an integer:
valid=`regex -v "$F5" '^[0-9]+$'`
Example 3 Translating an environment variable in a form
In a form, to translate an environment variable which contains one of the numbers 1, 2, 3, 4, 5 to the letters a, b, c, d, e:
value=`regex -v "$VAR1" 1 a 2 b 3 c 4 d 5 e '.*' 'Error'`
Note the use of the pattern '.*' to mean "anything else".
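For comparison, the same kind of translation can be sketched in plain shell with a case statement. The function name `translate_digit` is hypothetical. One difference to keep in mind: each case pattern here must match the whole value, whereas the regex patterns above are not anchored with ^ and $.

```shell
# Hypothetical helper, not part of FMLI: map 1..5 to a..e, else "Error".
translate_digit() {
    case $1 in
        1) echo a ;;
        2) echo b ;;
        3) echo c ;;
        4) echo d ;;
        5) echo e ;;
        *) echo Error ;;   # plays the role of the '.*' catch-all pattern
    esac
}

value=$(translate_digit 3)   # value is now "c"
```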
Example 4 Using backquoted expressions
In the example below, all three lines constitute a single backquoted expression. This expression, by itself, could be put in a menu definition
file. Since backquoted expressions are expanded as they are parsed, and output from a backquoted expression (the cat command, in this
example) becomes part of the definition file being parsed, this expression would read /etc/passwd and make a dynamic menu of all the login
ids on the system.
`cat /etc/passwd | regex '^([^:]*)$0.*$' '
name=$m0
action=`message "$m0 is a user"`'`
DIAGNOSTICS
If none of the patterns match, regex returns FALSE, otherwise TRUE.
NOTES
Patterns and templates must often be enclosed in single quotes to turn off the special meanings of characters. This is especially important if you use the
$m0 through $m9 variables in the template, since FMLI will expand the variables (usually to "") before regex even sees them.
Single characters in character classes (inside []) must be listed before character ranges, otherwise they will not be recognized. For example,
[a-zA-Z_/] will not find underscores (_) or slashes (/), but [_/a-zA-Z] will.
The regular expressions accepted by regcmp differ slightly from those accepted by other utilities (that is, sed, grep, awk, ed, and so forth).
regex with the -e option forces subsequent commands to be ignored. In other words, if a backquoted statement appears as follows:
`regex -e ...; command1; command2`
command1 and command2 would never be executed. However, dividing the expression into two:
`regex -e ...``command1; command2`
would yield the desired result.
ATTRIBUTES
See attributes(5) for descriptions of the following attributes:
+-----------------------------+-----------------------------+
| ATTRIBUTE TYPE | ATTRIBUTE VALUE |
+-----------------------------+-----------------------------+
|Availability |SUNWcsu |
+-----------------------------+-----------------------------+
SEE ALSO
awk(1), cut(1), grep(1), paste(1), sed(1), regcmp(3C), attributes(5)
SunOS 5.11 12 Jul 1999 regex(1F)