UNIX for Dummies Questions & Answers: post 302895771 by A-V, Wednesday 2 April 2014, 10:35:40 AM
Extracting combined differences based on a single column

Dear All,
I have two sets of files. File 1 contains numbers between 1 and 20, each followed by the frequency of that number in a given document, so which lines are present depends on the analysed document, e.g.
file1
Code:
1,5
4,1

Then I have file 2, which has basically the same numbers, each with a frequency of 0.
file2
Code:
1,0
2,0
3,0
4,0
...
20,0

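I want to create a third file that maps file1 onto file2: every line of file2 is kept, but the frequency is taken from file1 wherever the number appears there.
file3 (desired outcome)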
Code:
1,5
2,0
3,0
4,1
...
20,0

I have tried
Code:
awk -F, 'FNR==NR{X[$1]=$2;next}$1 in X{print $1","X[$1]==$2;next}{print $1","0}' file1 file2

but it does not print what is required: it puts 0 for every item that comes from file1.
Can you please help?

Last edited by A-V; 04-03-2014 at 08:51 AM.
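
A possible explanation and fix, offered as a sketch rather than a definitive answer: in awk, string concatenation binds more tightly than `==`, so `print $1","X[$1]==$2` appears to be parsed as `print (($1 "," X[$1]) == $2)`. That is a comparison, and since the concatenated string never equals the `0` in file2, it prints 0. Dropping the `==$2` should give the intended mapping. The scratch directory, the generated sample files, and the array name `freq` below are assumptions made for illustration; only the file names and data layout come from the post.

```shell
# Work in a scratch directory so no existing files are overwritten (assumption).
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Sample inputs modelled on the post: file1 holds observed frequencies,
# file2 lists every number from 1 to 20 with a frequency of 0.
printf '1,5\n4,1\n' > file1
awk 'BEGIN { for (i = 1; i <= 20; i++) print i ",0" }' > file2

# First pass (file1): remember each frequency, keyed by its number.
# Second pass (file2): print the remembered frequency when there is one,
# otherwise keep the line's own value (0).
awk -F, -v OFS=, '
    FNR == NR { freq[$1] = $2; next }
    { print $1, (($1 in freq) ? freq[$1] : $2) }
' file1 file2 > file3

# Show the first four lines of the result.
head -n 4 file3
```

With the sample data this prints `1,5`, `2,0`, `3,0`, `4,1`. One caveat of the `FNR == NR` idiom: if file1 is empty, the condition stays true while file2 is being read, so nothing is printed.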
 
