Using columns from 2 files and extracting string
Post 302571016 by radoulov on Saturday 5th of November 2011 01:12:28 PM
Yes, that is another bug.
Try swapping p1 == "M" and !c++, so that !c++ comes first and c is incremented on every token, not only on M tokens:

Code:
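awk 'NR == FNR {
  # first file (f2.txt): split field 6 into <count>S and <count>M tokens
  c=x
  while (match($6, /[0-9]*[SM]/)) {
    p1 = substr($6, RSTART + RLENGTH - 1, 1)
    # if the very first token is an M, skip S slot 1,
    # so that a later S is stored in slot 2
    !c++ && p1 == "M" && t[$1, "S"]++

    f2[$1, p1, ++t[$1, p1]] = substr($6, RSTART, RLENGTH - 1)
    $6 = substr($6, RSTART + RLENGTH)

    }
  # lengths of the first and second S and M tokens (0 when absent)
  f2s1[$1] = ($1, "S", 1) in f2 ? f2[$1, "S", 1] : 0
  f2s2[$1] = ($1, "S", 2) in f2 ? f2[$1, "S", 2] : 0
  f2m1[$1] = ($1, "M", 1) in f2 ? f2[$1, "M", 1] : 0
  f2m2[$1] = ($1, "M", 2) in f2 ? f2[$1, "M", 2] : 0
  # parts of field 10 covered by the first M token (counted from the start)
  # and by the second M token (counted from the end)
  f2m1s[$1] = substr($10, f2s1[$1] +1 , f2m1[$1])
  f2m2s[$1] = substr($10, length($10) - f2m2[$1] - f2s2[$1] +1 , f2m2[$1])
  f2_4[$1] = $4
  next
  }
# second file (f1.txt): only lines whose field 4 was seen as field 1 in f2.txt
$4 in f2s1 {
  # choose the first or the second M part, then cut the requested range from it
  _sub = (f2_4[$4] + f2s1[$4] + f2m1[$4]) > $9 ? f2m1s[$4] : f2m2s[$4]
  print $0, substr(_sub, $8 - $2 + 1, $9 - $8)
  }' f2.txt f1.txt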
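
To see why the order of the two tests matters, here is a small sketch, not part of the original script. It assumes the earlier version tested p1 == "M" before !c++, and it uses a hypothetical helper named slots that prints the slot number each token ends up in. Because && short-circuits, c++ only runs when everything to its left is true, so with the old order c stays at 0 until the first M token, wherever that token appears.

```shell
# Hypothetical helper (not from the thread): show which slot each token gets.
#   slots old STRING  -> tests p1 == "M" first (assumed earlier order)
#   slots new STRING  -> tests !c++ first (order used in the post above)
slots() {
  printf '%s\n' "$2" | awk -v order="$1" '{
    s = $1; c = 0
    while (match(s, /[0-9]*[SM]/)) {
      p1 = substr(s, RSTART + RLENGTH - 1, 1)
      if (order == "old") {
        # c is incremented only when an M token is seen
        if (p1 == "M" && !c++) t["S"]++
      } else {
        # c is incremented on every token, so only token 1 can pass
        if (!c++ && p1 == "M") t["S"]++
      }
      printf "%s%s[%d]=%s", (n++ ? " " : ""), p1, ++t[p1], substr(s, RSTART, RLENGTH - 1)
      s = substr(s, RSTART + RLENGTH)
    }
    print ""
  }'
}

slots old 5S20M3S   # S[1]=5 M[1]=20 S[3]=3  (trailing S misses slot 2)
slots new 5S20M3S   # S[1]=5 M[1]=20 S[2]=3
slots new 20M3S     # M[1]=20 S[2]=3
```

With the old order the trailing S lands in slot 3, so a lookup of slot 2 finds nothing and falls back to 0.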

This is the code with debug statements that I've used:

Code:
awk 'NR == FNR {
  c=x
  #debug
  print "debug:", $6
  while (match($6, /[0-9]*[SM]/)) {
    p1 = substr($6, RSTART + RLENGTH - 1, 1)    
    !c++ && p1 == "M" && t[$1, "S"]++ 

    f2[$1, p1, ++t[$1, p1]] = substr($6, RSTART, RLENGTH - 1) 
    $6 = substr($6, RSTART + RLENGTH)

    }    
  f2s1[$1] = ($1, "S", 1) in f2 ? f2[$1, "S", 1] : 0 
  f2s2[$1] = ($1, "S", 2) in f2 ? f2[$1, "S", 2] : 0
  f2m1[$1] = ($1, "M", 1) in f2 ? f2[$1, "M", 1] : 0
  f2m2[$1] = ($1, "M", 2) in f2 ? f2[$1, "M", 2] : 0 
  f2m1s[$1] = substr($10, f2s1[$1] +1 , f2m1[$1])
  #debug
  for (i in f2)
    print "debug: f2:" i, f2[i]
  print "debug: f2s1[$1]", f2s1[$1]
  print "debug: f2s2[$1]", f2s2[$1]
  print "debug: f2m1[$1]", f2m1[$1]
  print "debug: f2m2[$1]", f2m2[$1]
  f2m2s[$1] = substr($10, length($10) - f2m2[$1] - f2s2[$1] +1 , f2m2[$1])
  print "debug: f2m2s[$1]", f2m2s[$1]
  f2_4[$1] = $4
  next    
  }
$4 in f2s1 {
  _sub = (f2_4[$4] + f2s1[$4] + f2m1[$4]) > $9 ? f2m1s[$4] : f2m2s[$4]
  print $0, substr(_sub, $8 - $2 + 1, $9 - $8)
  }' f2.txt f1.txt
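
The debug lines above go to standard output together with the results. If that gets in the way, one option is to send them to standard error instead. This is only a sketch with a hypothetical demo_debug function and made-up input; it assumes an awk that accepts "/dev/stderr" as an output file (gawk does, and on Linux the device file exists for other awks too).

```shell
# Hypothetical demo: debug lines go to stderr, results stay on stdout.
demo_debug() {
  printf '%s\n' "r1 5S20M3S" | awk '{
    # debug output, kept apart from the real result
    print "debug:", $2 > "/dev/stderr"
    print $1, length($2)
  }'
}

demo_debug 2>/dev/null   # stdout only: r1 7
```

Redirecting with 2>/dev/null hides the debug lines, and 2>debug.log would collect them in a file.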

 
