07-22-2016
The first rule for producing code that works is to get a clear description of the input format(s) and the desired output format(s). Do not expect us to write code for you that magically guesses correctly at inconsistent input file formats. Note that:
- there are no blank lines nor any header lines in the files in post #1, and
- there are blank lines in file1 and headers in both input files in post #4.
We might be able to make adjustments for blank or empty lines in your input files (if we can determine whether or not a blank or empty line is a header).
If you give us a clear description of your input file formats that holds in ALL cases, so that a program can tell for certain whether any given line in an input file is a header or data, we might be able to handle that as well. But I'm not going to attempt to guess at what might appear as data or as headers, how many lines of headers might be present in each input file, what characters might appear in headers, or what characters might appear in data in other input files you might throw at us later.
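To illustrate what "programmatically sure" could look like, here is a minimal sketch. It assumes (purely for illustration, since no such rule has been stated in this thread) that fields are comma-separated and that every data line starts with a timestamp of the form YYYY-MM-DD HH:MM:SS. The function name data_lines is made up for this example.

```shell
# Hypothetical rule: a data line is one whose first comma-separated field
# looks exactly like "YYYY-MM-DD HH:MM:SS". Blank lines and header lines
# do not match that pattern, so they are dropped.
# Adjust the separator and the pattern to the real, documented format.
data_lines() {
    awk -F',' '
        $1 ~ /^[0-9][0-9][0-9][0-9]-[0-9][0-9]-[0-9][0-9] [0-9][0-9]:[0-9][0-9]:[0-9][0-9]$/ { print }
    ' "$@"
}
```

With a rule like this, something like data_lines file1 could feed the comparison code no matter how many header or blank lines appear. Without such a rule, any filter is a guess.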
Note also that character string comparisons of field 1 in both sets of sample input files happen to work with the date formats used in those files. That might not be true for other date and time formats. (And, if the date and time format used in file1 is not the same as the format used in file2 in any pair of input files, the code needed to normalize date and time strings for comparison might be a significant research project unless you give us a clear description of the date and time formats used in each input file.)
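As a sketch of the kind of normalization that might be needed, suppose (hypothetically, this is not one of the formats shown in this thread) that one file used MM/DD/YYYY HH:MM:SS in a comma-separated field 1. As plain strings, "12/31/2015" sorts after "01/02/2016" even though it is the earlier date. Rewriting the field year-first makes string comparison agree with chronological order. The function name normalize_mdy is made up for this example.

```shell
# Assumption: field 1 is "MM/DD/YYYY HH:MM:SS" and fields are comma-separated
# (a hypothetical format chosen only to show the idea).
# Rewrite field 1 as "YYYY-MM-DD HH:MM:SS" so that plain string comparison
# orders records chronologically.
normalize_mdy() {
    awk -F',' -v OFS=',' '
        {
            split($1, dt, " ")      # dt[1] = date part, dt[2] = time part
            split(dt[1], d, "/")    # d[1] = MM, d[2] = DD, d[3] = YYYY
            $1 = sprintf("%04d-%02d-%02d %s", d[3], d[1], d[2], dt[2])
            print
        }
    ' "$@"
}
```

This only works because the input format is known in advance. Each additional format needs its own documented rule, which is why a description of the formats has to come first.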