Extracting combined differences based on a single column
Dear All,
I have two sets of files. Each line of file 1 holds a number between 1 and 20 followed by the frequency of that number in a given document, so which lines appear in the file depends on the analysed document. E.g.:
file1
Code:
1,5
4,1
Then I have file 2, which has basically the same numbers but with a frequency of 0:
file2
Code:
1,0
2,0
3,0
4,0
...
20,0
I want to create a third file that maps file1 onto file2, keeping every number from file2 and taking the frequency from file1 where one exists:
file3 - outcome
Code:
1,5
2,0
3,0
4,1
...
20,0
I have tried
Code:
awk -F, 'FNR==NR{X[$1]=$2;next}$1 in X{print $1","X[$1]==$2;next}{print $1","0}' file1 file2
but it does not print what is required and puts 0 for any item from file1
Can you please help?
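A possible fix, offered as a minimal sketch rather than the only way to do it: in the attempt above, `print $1","X[$1]==$2` is parsed as a comparison between the concatenated string and `$2`, so awk prints the result of the test (0) instead of the pair. Dropping the `==$2` prints the stored frequency. The sample files below are shortened to the numbers 1 to 4 for illustration, and the array name `freq` is my own choice.

```shell
# Work in a scratch directory so no existing files are overwritten.
cd "$(mktemp -d)"

# Shortened sample data (the real file2 runs from 1 to 20).
printf '1,5\n4,1\n' > file1
printf '1,0\n2,0\n3,0\n4,0\n' > file2

# First pass (file1): remember the frequency for each number.
# Second pass (file2): print file1's frequency if there is one, else file2's own value.
awk -F, 'NR==FNR { freq[$1] = $2; next }
         { print $1 "," (($1 in freq) ? freq[$1] : $2) }' file1 file2 > file3

cat file3
```

Note that the `NR==FNR` idiom assumes file1 is not empty; with an empty file1, awk would treat file2 as the first file.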
Hi,
I have a column in two different files that I want to compare, writing the results to a third file. The column is in a different position in each file:
File 1: the column is in positions 10-15
File 2: the column is in positions 15-20
Please advise
Thanks (1 Reply)
I am a newbie to shell scripting ..
I have a .csv file. It has 1000 some rows and about 7 columns...
but before I insert this data into a table I have to parse and clean it, based on the value of the first column, which is a phone-number-type string...
example below..
column 1 ... (2 Replies)
Hello,
I am new to shell scripting. I have a huge file with multiple columns for example:
I have 5 columns below.
HWUSI-EAS000_29:1:105 + chr5 76654650 AATTGGAA HHHHG
HWUSI-EAS000_29:1:106 + chr5 76654650 AATTGGAA B@HYL
HWUSI-EAS000_29:1:108 + ... (4 Replies)
I have a tab delimited text file where the first column can take on three different values : 100, 150, 250. I want to extract all the rows where the first column is 100 and put them into a separate text file and so on. This is what my text file looks like now:
100 rs3794811 0.01 0.3434... (1 Reply)
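One way this kind of split is often done, sketched here under assumptions: awk can redirect each row to a file named after its first column. The file name `input.txt`, the `.txt` output names, and every sample row except the first are made up for illustration.

```shell
# Work in a scratch directory so no existing files are overwritten.
cd "$(mktemp -d)"

# Tab-delimited sample; only the first row comes from the post, the rest are hypothetical.
printf '100\trs3794811\t0.01\t0.3434\n' >  input.txt
printf '150\trs0000002\t0.02\t0.5\n'    >> input.txt
printf '100\trs0000003\t0.03\t0.1\n'    >> input.txt

# Write each row to <first column>.txt, e.g. 100.txt, 150.txt, 250.txt.
awk -F'\t' '{ out = $1 ".txt"; print > out }' input.txt

cat 100.txt
```

With only three distinct values in the first column, awk keeps at most three output files open, so no explicit `close()` is needed.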
I have a text file where the second column is a list of numbers going from small to large. I want to extract the rows where the second column is smaller than or equal to 0.0001.
My input:
rs10082730 9e-08 12 46002702
rs2544081 1e-07 12 46015487
rs1425136 1e-06 7 35396742
rs2712590... (1 Reply)
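A minimal sketch of a numeric filter on the second column, assuming whitespace-separated fields as in the rows shown. The file names and the last sample row (`rs0000001`) are hypothetical, added only so that one row fails the test.

```shell
# Work in a scratch directory so no existing files are overwritten.
cd "$(mktemp -d)"

# The first three rows come from the post; the last one is made up.
printf 'rs10082730 9e-08 12 46002702\n' >  input.txt
printf 'rs2544081 1e-07 12 46015487\n'  >> input.txt
printf 'rs1425136 1e-06 7 35396742\n'   >> input.txt
printf 'rs0000001 0.5 1 12345\n'        >> input.txt

# "$2 + 0" forces a numeric comparison, so values like 9e-08 are read as numbers.
awk '$2 + 0 <= 0.0001' input.txt > small.txt

cat small.txt
```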
Hello, I'm trying to delete duplicates when there are more than 10 duplicates, based on the value of the first column.
e.g.
a 1
a 2
a 3
b 1
c 1
gives
b 1
c 1
but requires 11 duplicates before it deletes.
Thanks for the help
(11 Replies)
I have a space delimited text file. I want to extract rows where the third column has 0 as a value and write those rows into a new space delimited text file. How do I go about doing that? Thanks! (2 Replies)
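A minimal sketch of one way to do this with awk, assuming single spaces between fields and that "0" means the literal string 0. The file names and sample rows are hypothetical.

```shell
# Work in a scratch directory so no existing files are overwritten.
cd "$(mktemp -d)"

# Hypothetical space-delimited sample.
printf 'a 1 0\nb 2 5\nc 3 0\nd 4\n' > input.txt

# Keep rows whose third field is exactly "0". The string comparison means
# rows with fewer than three fields (like "d 4") are not matched by accident.
awk -F' ' '$3 == "0"' input.txt > zero_rows.txt

cat zero_rows.txt
```

If values such as 0.0 should also count as zero, a numeric test like `NF >= 3 && $3 + 0 == 0` is an alternative, though it would also match non-numeric text in that column.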
Hi all,
I am new to shell scripting and need your help writing a script.
I need to extract data from a .csv file whose columns are separated by ','.
The file has 5 columns, column 1, column 2 ... column 5, with values as below.... (3 Replies)
Dear All,
I have to solve the following problems with multiple tab-separated text files, but I don't know how. Any help would be greatly appreciated. I have access to Linux Mint (but I am not a professional user).
I have multiple tab-delimited files with the following structure:
file1:
1 44
2 ... (5 Replies)
Discussion started by: Bastami
5 Replies
sdiff
SDIFF(1) GNU Tools SDIFF(1)
NAME
sdiff - find differences between two files and merge interactively
SYNOPSIS
sdiff -o outfile [options] from-file to-file
DESCRIPTION
The sdiff command merges two files and interactively outputs the results to outfile.
If from-file is a directory and to-file is not, sdiff compares the file in from-file whose file name is that of to-file, and vice versa.
from-file and to-file may not both be directories.
sdiff options begin with -, so normally from-file and to-file may not begin with -. However, -- as an argument by itself treats the
remaining arguments as file names even if they begin with -. You may not use - as an input file.
sdiff without -o (or --output) produces a side-by-side difference. This usage is obsolete; use diff --side-by-side instead.
Options
Below is a summary of all of the options that GNU sdiff accepts. Each option has two equivalent names, one of which is a single letter
preceded by -, and the other of which is a long name preceded by --. Multiple single letter options (unless they take an argument) can be
combined into a single command line argument. Long named options can be abbreviated to any unique prefix of their name.
-a Treat all files as text and compare them line-by-line, even if they do not appear to be text.
-b Ignore changes in amount of white space.
-B Ignore changes that just insert or delete blank lines.
-d Change the algorithm to perhaps find a smaller set of changes. This makes sdiff slower (sometimes much slower).
-H Use heuristics to speed handling of large files that have numerous scattered small changes.
--expand-tabs
Expand tabs to spaces in the output, to preserve the alignment of tabs in the input files.
-i Ignore changes in case; consider upper- and lower-case to be the same.
-I regexp
Ignore changes that just insert or delete lines that match regexp.
--ignore-all-space
Ignore white space when comparing lines.
--ignore-blank-lines
Ignore changes that just insert or delete blank lines.
--ignore-case
Ignore changes in case; consider upper- and lower-case to be the same.
--ignore-matching-lines=regexp
Ignore changes that just insert or delete lines that match regexp.
--ignore-space-change
Ignore changes in amount of white space.
-l
--left-column
Print only the left column of two common lines.
--minimal
Change the algorithm to perhaps find a smaller set of changes. This makes sdiff slower (sometimes much slower).
-o file
--output=file
Put merged output into file. This option is required for merging.
-s
--suppress-common-lines
Do not print common lines.
--speed-large-files
Use heuristics to speed handling of large files that have numerous scattered small changes.
-t Expand tabs to spaces in the output, to preserve the alignment of tabs in the input files.
--text Treat all files as text and compare them line-by-line, even if they do not appear to be text.
-v
--version
Output the version number of sdiff.
-w columns
--width=columns
Use an output width of columns. Note that for historical reasons, this option is -W in diff, -w in sdiff.
-W Ignore horizontal white space when comparing lines. Note that for historical reasons, this option is -w in diff, -W in sdiff.
SEE ALSO
cmp(1), comm(1), diff(1), diff3(1).
DIAGNOSTICS
An exit status of 0 means no differences were found, 1 means some differences were found, and 2 means trouble.
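EXAMPLE (not part of the original manual page)
The exit status described above can be tested from a script. This is a minimal sketch assuming GNU sdiff is installed; the file names are hypothetical.

```shell
# Work in a scratch directory so no existing files are overwritten.
cd "$(mktemp -d)"

printf 'a\nb\n' > left.txt
printf 'a\nc\n' > right.txt

# -s suppresses common lines, so only the differing pair is shown.
sdiff -s left.txt right.txt
status=$?

# Map the documented exit statuses to a message.
case $status in
  0) echo "no differences" ;;
  1) echo "differences found" ;;
  *) echo "trouble" ;;
esac
```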
GNU Tools 22sep1993 SDIFF(1)