Full Discussion: File Comparison
Post 302155042 by ghostdog74 on Wednesday 2nd of January 2008 07:04:26 PM
Quote:
Originally Posted by dislusive
If I understand what you're trying to do correctly, here's a quick bash script.

Code:
#!/bin/bash

# No spaces around '=' in shell variable assignments.
compareFile="/path/to/file/to/compare.txt"
outputFile="/path/to/outputFile.txt"

for filename in /some/dir/of/text/files/*.txt; do

        # Count the lines in the current file.
        numlines=$(wc -l < "$filename")

        for i in $(seq 1 $numlines); do
                # Extract line number $i.
                current=$(head -n "$i" "$filename" | tail -n 1)

                # -F treats the line as a fixed string rather than a regex.
                if ! grep -qF -- "$current" "$compareFile"; then
                        # Doesn't exist, append to $outputFile.
                        echo "${filename}:${current}" >> "$outputFile"
                fi
        done
done

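As the OP mentioned, the files are gigabytes in size, so I think this approach will be slow: it re-reads the file with head and tail for every single line and then runs a separate grep on it. Just a guess, though.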
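One way to avoid the per-line head/tail/grep calls is to read the comparison file once and check every other line against it in a single pass. The following is only a rough sketch, not a tested solution for the OP's data: it assumes whole-line matches are what is wanted (the script above matches substrings), it assumes the comparison file's lines fit in memory, and the temporary directory and sample files are made up for the demo.

```shell
#!/bin/bash
# Sketch only: tmpdir and the sample files are hypothetical demo data.
tmpdir=$(mktemp -d)
compareFile="$tmpdir/compare.txt"
outputFile="$tmpdir/output.txt"

printf 'apple\nbanana\n' > "$compareFile"
mkdir "$tmpdir/texts"
printf 'apple\ncherry\n' > "$tmpdir/texts/a.txt"
printf 'banana\ndate\n'  > "$tmpdir/texts/b.txt"

# First file argument: remember every line of the comparison file.
# Remaining arguments: print each line that was not remembered,
# prefixed with the name of the file it came from.
awk 'FILENAME == ARGV[1] { seen[$0]; next }
     !($0 in seen)       { print FILENAME ":" $0 }' \
    "$compareFile" "$tmpdir"/texts/*.txt > "$outputFile"

result=$(cat "$outputFile")
echo "$result"

# Remove the demo files again.
rm -rf "$tmpdir"
```

Each input file is read exactly once here, so the run time should grow with the total size of the files rather than with the number of lines times the file size.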
Also, seq is not a standard command on some *nix systems. If you want to loop over a counter, you can use a while loop instead, e.g. while [ $num -le $numlines ]
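A minimal sketch of such a counter loop, using only POSIX shell features. The value of numlines is set by hand here for illustration (in the script above it would come from wc -l), and the visited variable is just a stand-in for the per-line work.

```shell
#!/bin/sh
# Sketch only: numlines is hard-coded for the demo.
numlines=3
num=1
visited=""

while [ "$num" -le "$numlines" ]; do
        visited="$visited$num "   # stand-in for the per-line work
        num=$((num + 1))          # POSIX arithmetic expansion, no seq needed
done

echo "$visited"
```

The counter has to be initialised before the loop and incremented inside it, otherwise the loop never ends.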
 
