Number of matches in 2 strings: Post 302658033 by bartus11 on Tuesday 19th of June 2012 04:02:01 AM
That is because your actual data is several gigabytes in size. You should also check whether any swapping is going on while the script runs, because in that case it would really take forever to complete. Check it with: vmstat 1 (look for nonzero values in the si and so columns).
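If you would rather not watch the columns by eye, the check can be scripted. The following is only a rough sketch: the helper name swapping_seen is made up for illustration, and it assumes the Linux vmstat layout, in which one header row names the si and so columns and the first sample is an average since boot.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: returns 1 if any vmstat sample read from $fh shows
# nonzero swap-in (si) or swap-out (so), otherwise 0.
sub swapping_seen {
    my ($fh) = @_;
    my ($si, $so);       # column positions, taken from the header row
    my $samples = 0;
    while (my $line = <$fh>) {
        my @f = split ' ', $line;
        if (!defined $si) {
            # Look for the header row that names the si and so columns
            my %pos;
            @pos{@f} = 0 .. $#f;
            ($si, $so) = @pos{qw(si so)} if exists $pos{si} && exists $pos{so};
            next;
        }
        # Skip repeated headers and anything else that is not numeric
        next unless defined $f[$si] && defined $f[$so];
        next unless $f[$si] =~ /^\d+$/ && $f[$so] =~ /^\d+$/;
        # Assumption: the first sample is an average since boot, so ignore it
        next if ++$samples == 1;
        return 1 if $f[$si] > 0 || $f[$so] > 0;
    }
    return 0;
}

# Read vmstat output from standard input when it is piped in
unless (-t STDIN) {
    print swapping_seen(\*STDIN) ? "swapping detected\n" : "no swapping seen\n";
}
```

Saved under a name of your choice (check_swap.pl here is just an example), it could be run as: vmstat 1 10 | perl check_swap.pl while the main script is working.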

---------- Post updated 06-19-12 at 03:02 AM ---------- Previous update was 06-18-12 at 02:24 PM ----------

As advised by drl, I modified the script a bit to show progress while loading the file:
Code:
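#!/usr/bin/perl
use strict;
use warnings;

# Input: space-separated file given as first argument; line 1 holds the column names
open my $in, '<', $ARGV[0] or die "Cannot open $ARGV[0]: $!";
print STDERR "Reading columns (one dot is 100000 lines)\n";
my (@col_names, @col);
while (my $line = <$in>) {
  chomp $line;  # chomp inside the loop, so a last line without a newline is not dropped
  print STDERR "." if $. % 100000 == 0;
  if ($. == 1) {
    @col_names = split / /, $line;
    next;
  }
  my @fields = split / /, $line;
  # Append this row's value to the string kept for each column
  for my $i (0 .. $#col_names) {
    $col[$i] .= $fields[$i] // '';
  }
}
close $in;
# Compare every pair of columns position by position
for my $i (0 .. $#col) {
  my @chars1 = split //, $col[$i];  # split once per column, not once per pair
  for my $j ($i + 1 .. $#col) {
    my @chars2 = split //, $col[$j];
    # Positions past the end of the shorter column can never match
    my $last = ($#chars1 <= $#chars2) ? $#chars1 : $#chars2;
    my $matches = 0;
    for my $k (0 .. $last) {
      $matches++ if $chars1[$k] eq $chars2[$k];
    }
    print "$col_names[$i] $col_names[$j] $matches\n";
  }
}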
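Splitting every column into a list of single characters costs a lot of memory with data this large. A possible alternative, shown only as a sketch and not tested against the real data, is to XOR the two column strings and count the NUL bytes in the result, since equal bytes XOR to "\0". The sub name count_matches is made up for illustration, and the approach assumes single-byte characters and no NUL bytes in the input.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: count the positions at which two strings hold the same byte.
# String XOR gives "\0" wherever the bytes agree, and tr counts those.
# Both arguments must be strings (not numbers), or ^ would do a numeric XOR.
sub count_matches {
    # Work on @_ directly to avoid copying potentially huge strings
    my $xor = $_[0] ^ $_[1];
    # With unequal lengths the shorter string is padded with "\0", so the
    # extra positions stay nonzero as long as the data has no NUL bytes
    return ($xor =~ tr/\0//);
}

print count_matches("ACGTAC", "ACCTAA"), "\n";  # prints 4
```

In the script above, the innermost loop over $k could then be replaced by a single call such as count_matches($col[$i], $col[$j]), and the two split // lines would no longer be needed.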

 
