Print the overlapping entries in 2 files to separate file
Post 302885635 by MadeInGermany on Monday 27th of January 2014 03:26:49 PM
The following seems to perform a correct merge.
The overlap function uses -le (less than or equal), which means that equal boundaries count as an overlap. If they should not, use -lt (less than) for the two lower-against-upper comparisons ($1 -le $4 and $3 -le $2).
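To make the boundary rule concrete, here is a minimal sketch of both variants. The names overlap_le, overlap_lt and check are made up for this illustration, and the tests are written with && and || instead of -a and -o. In the strict variant only the lower-against-upper comparisons become -lt; I am assuming that is the intended reading, because with -lt everywhere two ranges sharing the same upper bound would test as non-overlapping.

```shell
#!/bin/sh
# Sketch only. Arguments for both functions: $1=lo1 $2=hi1 $3=lo2 $4=hi2
# overlap_le: inclusive test, same logic as the script's overlap function.
overlap_le() {
  [ "$1" -le "$4" ] && [ "$4" -le "$2" ] || { [ "$3" -le "$2" ] && [ "$2" -le "$4" ]; }
}
# overlap_lt: hypothetical strict variant, touching boundaries do not count.
# The upper-against-upper comparisons stay -le on purpose.
overlap_lt() {
  [ "$1" -lt "$4" ] && [ "$4" -le "$2" ] || { [ "$3" -lt "$2" ] && [ "$2" -le "$4" ]; }
}
# check: hypothetical helper that runs a test and reports the result.
check() {
  if "$@"; then echo "$*: overlap"; else echo "$*: no overlap"; fi
}
check overlap_le 10 20 20 30   # prints: overlap_le 10 20 20 30: overlap
check overlap_lt 10 20 20 30   # prints: overlap_lt 10 20 20 30: no overlap
check overlap_lt 10 20 15 20   # prints: overlap_lt 10 20 15 20: overlap
```

The ranges 10-20 and 20-30 only touch at 20, so they are the case where the two variants disagree.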
I left some debug code in. If you think it behaves wrongly, enable the debug=echo line and comment out the debug=: line.
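In case the debug switch is unfamiliar, a small sketch of how it works: the variable holds a command name, so the same statement is either a no-op or an echo. The trace function is made up for this illustration.

```shell
#!/bin/sh
# Sketch: a statement that starts with $debug is silent when debug is ':'
# and prints its arguments when debug is 'echo'.
trace() {
  debug=$1
  $debug change to input2
}
trace :      # ':' ignores its arguments, so nothing is printed
trace echo   # prints: change to input2
```

Note that the arguments are still expanded by the shell when debug is ':'; they are just discarded.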
Code:
#!/bin/sh
#
set -f # no globbing
#
#debug=echo
debug=:
#
# $1=lo[file1]
# $2=hi[file1]
# $3=lo[file2]
# $4=hi[file2]
# True if hi2 lies within [lo1,hi1] or hi1 lies within [lo2,hi2].
overlap(){
  [ $1 -le $4 -a $4 -le $2 -o $3 -le $2 -a $2 -le $4 ]
}
# Main loop: input1 is read from stdin, input2 from file descriptor 3.
{
readfrom=1
while :
do
  if [ $readfrom -eq 1 ]
  then
    read line || {
      readfrom=2
      read line <&3 || break
    }
  else
    read line <&3 || {
      readfrom=1
      read line || break
    }
  fi
  set -- $line # split into fields: $2=lower bound, $3=upper bound
  if [ -n "$hip" ]
  then
    if overlap $lop $hip $2 $3
    then
      echo "$line"
      lop=$2; hip=$3
    else
$debug no overlap $lop,$hip $2,$3
      if [ -n "$saved" ]
      then
$debug restore $save
        saved=""
        set -- $save
        overlap $lop $hip $2 $3 || echo ""
        echo "$save"
        lop=$2; hip=$3
      else
        if [ $readfrom -eq 1 ]
        then
          readfrom=2
        else
          readfrom=1
        fi
$debug change to input$readfrom
      fi
$debug save $line
      save=$line
      saved=1
    fi
  else
    echo "$line"
    lop=$2; hip=$3
  fi
done
if [ -n "$saved" ]
then
$debug restore $save
  set -- $save
  overlap $lop $hip $2 $3 || echo ""
  echo "$save"
fi
} <input1 3<input2
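
The two-input trick (one file on stdin, the other on descriptor 3) may be easier to follow in isolation. This is a simplified sketch, not the script's logic: it switches input after every line instead of switching on a failed overlap test. The function name alternate and the demo file contents are made up.

```shell
#!/bin/sh
# Sketch: read alternately from stdin and from file descriptor 3,
# falling back to the other stream once one of them is exhausted.
alternate() {
  from=1
  while :
  do
    if [ "$from" -eq 1 ]
    then
      read -r line || { read -r line <&3 || break; }
      from=2
    else
      read -r line <&3 || { read -r line || break; }
      from=1
    fi
    printf '%s\n' "$line"
  done
}

# Demo with two throwaway files (hypothetical contents).
dir=$(mktemp -d)
printf 'a1\na2\na3\n' >"$dir/a"
printf 'b1\n' >"$dir/b"
alternate <"$dir/a" 3<"$dir/b"   # prints a1, b1, a2, a3 on separate lines
rm -rf "$dir"
```

Because both redirections are attached to the function call (or, in the script, to the closing brace), each read continues where the previous read on the same descriptor stopped.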
