file comparison
Post 302701401 by Don Cragun on Saturday 15th of September 2012 10:38:12 PM
I may have misunderstood your requirements, but the script I came up with skips not only the file1 lines with ID 2 and 5, but also the line with ID 3. The difference between the start columns for gene_name test2 is 1500, and you said to keep entries only if the difference between the second column of file2 and file1 is less than 1001.
If this doesn't do what you want, maybe it will at least give you something to easily modify to get what you want:
Code:
#!/bin/ksh
awk 'FNR==NR{if(NR != 1) {
                # Save fields from 1st file for comparison with the 2nd file...
                key[$4] = NR
                start[NR] = $2
                end[NR] = $3
        }
        next
}
 {      if(FNR == 1) {
                # Copy the header line to the new file.
                print
                next
        }
        if(!($5 in key)) {
                if(debug) printf("No entry found for key %s: %s\n", $5, $0)
                next
        }
        entry = key[$5]
        diff = $2 > start[entry] ? $2 - start[entry] : start[entry] - $2
        if(diff > 1000) {
                if(debug) printf("Start diffe |%d-%d| > 1000: %s\n",
                        $2, start[entry], $0)
                next
        }
        if($3 > end[entry]) {
                if(debug) printf("End field too big: (%d > %d) %s\n",
                        $3, end[entry], $0)
                next
        }
        # We passed all the tests, add entry to output file.
        print
}' debug=1 file2 file1

When run in debug mode (as specified by the last line of the script above), the output is:
Code:
chr	start		end 		ID	gene_name
chr1	2020		3030		1	test1
Start diffe |900-2000| > 1000: chr1	900		5000		2	test1
Start diffe |5000-3500| > 1000: chr2	5000		8000		3	test2
chr3	6000		12000		4	test3
End field too big: (15000 > 12000) chr3	6000		15000		5	test3
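
The two start-difference rejections can be checked by hand. This is only a minimal sketch: it feeds the number pairs from the debug lines for ID 2 and ID 3 to the same absolute-difference test the script uses; the variable name result is just for illustration.

```shell
# Each input line is "file1_start file2_start", taken from the debug output.
result=$(printf '%s\n' '900 2000' '5000 3500' |
    awk '{
        # Absolute difference, computed the same way as in the script.
        diff = $1 > $2 ? $1 - $2 : $2 - $1
        print diff, (diff > 1000 ? "skip" : "keep")
    }')
echo "$result"
```

Both pairs come out above the 1000 limit (1100 and 1500), which is why neither line reaches the output.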

When run with debugging turned off and output redirected to file3 by changing the last line of the script from:
Code:
}' debug=1 file2 file1

to:
Code:
}' file2 file1 > file3

file3 will contain:
Code:
chr     start           end             ID      gene_name
chr1    2020            3030            1       test1
chr3    6000            12000           4       test3
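
If you want to try this end to end, here is a rough sketch with sample input. The file1 rows are taken from the debug output above. The contents of file2 are not shown in this post, so the layout (chr, start, end, gene_name) and most of the values below are assumptions, chosen only so they agree with the debug messages (starts of 2000 and 3500, an end of 12000). The name filter.awk and the temporary directory are also just for illustration.

```shell
#!/bin/sh
# Work in a scratch directory so nothing existing is overwritten.
dir=$(mktemp -d) && cd "$dir" || exit 1

# file1: rows as they appear in the debug output of the post.
printf '%s\t%s\t%s\t%s\t%s\n' \
    chr  start end   ID gene_name \
    chr1 2020  3030  1  test1 \
    chr1 900   5000  2  test1 \
    chr2 5000  8000  3  test2 \
    chr3 6000  12000 4  test3 \
    chr3 6000  15000 5  test3 > file1

# file2: ASSUMED contents (not shown in the post).
printf '%s\t%s\t%s\t%s\n' \
    chr  start end   gene_name \
    chr1 2000  4000  test1 \
    chr2 3500  9000  test2 \
    chr3 5500  12000 test3 > file2

# Same logic as the script above, stored in a file so it can be run twice.
cat > filter.awk <<'EOF'
FNR == NR {             # first file operand (file2): save start/end by gene_name
    if (NR != 1) { key[$4] = NR; start[NR] = $2; end[NR] = $3 }
    next
}
FNR == 1 { print; next }        # copy the file1 header line
!($5 in key) {
    if (debug) printf("No entry found for key %s: %s\n", $5, $0)
    next
}
{
    entry = key[$5]
    diff = $2 > start[entry] ? $2 - start[entry] : start[entry] - $2
    if (diff > 1000) {
        if (debug) printf("Start diffe |%d-%d| > 1000: %s\n", $2, start[entry], $0)
        next
    }
    if ($3 > end[entry]) {
        if (debug) printf("End field too big: (%d > %d) %s\n", $3, end[entry], $0)
        next
    }
    print               # passed all the tests
}
EOF

awk -f filter.awk debug=1 file2 file1     # debug run, messages mixed into the output
awk -f filter.awk file2 file1 > file3     # quiet run, kept lines go to file3
cat file3
```

With these assumed file2 values, file3 ends up with the header plus the lines for ID 1 and ID 4, the same as shown above. The debug=1 operand works because awk treats a var=value operand as an assignment made when it reaches that point in the operand list, before it opens the next file.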

 
