Apologies in case I am disturbing you with my posts.
I am not much good at awk scripting, but I do shell scripting and try to learn more from the issues I come across.
But I sincerely need a workaround for this query.
I tried the code below, but it is not working as I expected.
It works when column 2 contains a unique value in every row, but if row 2 and row 5 contain the same value, it prints "error".
I would appreciate any help improving it.
Suppose the file contains duplicate values in column 1; then, for each of those duplicate values, column 2 should contain unique values.
As in the sample file above.
There won't be just 2 lines per document number; there can be any number of duplicate values, more than 5 or even 50.
Yes, there should be non-identical numbers in column 2 (line number) per document number (column 1). There is no limit on the numbers; they just have to be non-duplicate.
If column 1 contains duplicate values across rows, then column 2 should contain non-duplicate values for the rows sharing each of those values.
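One possible way to express that requirement is to track the pair (column 1, column 2) instead of column 2 alone, so the same line number under two different document numbers is not flagged. The following is only a sketch, not the code from the original post: the function name `check_pairs` is hypothetical, and it assumes comma-separated input with no header row and no quoted fields.

```shell
# Hypothetical helper: report rows whose (column 1, column 2) pair repeats.
# Assumption: comma-separated input, no header row, no quoted fields.
check_pairs() {
    awk -F',' '
        # seen[] counts each (document number, line number) pair;
        # the pattern is true from the second occurrence of a pair onward
        seen[$1 FS $2]++ {
            printf "error: document %s repeats line number %s (row %d)\n", $1, $2, NR
            bad = 1
        }
        # exit status 1 if any repeated pair was found, 0 otherwise
        END { exit bad + 0 }
    ' "$1"
}
```

Calling `check_pairs file.csv` prints one line per repeated pair and returns a non-zero status if any was found, so it can be used in an `if` in a larger shell script.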
Moderator's Comments:
Please use CODE tags as required by forum rules!
Last edited by RudiC; 01-02-2018 at 06:32 AM.
Reason: Added CODE tags.
Hi all,
I have a huge csv file with the following format of data,
Num SNPs, 549997
Total SNPs,555352
Num Samples, 157
SNP, SampleID, Allele1, Allele2
A001,AB1,A,A
A002,AB1,A,A
A003,AB1,A,A
...
...
...
I would like to write out a list of unique SNPs (column 1). Could you... (3 Replies)
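A minimal sketch of one way to list the distinct column 1 values with standard tools. The function name `unique_snps` is hypothetical, and the sketch assumes that exactly four non-data lines (the three summary lines and the `SNP, SampleID, ...` header) precede the records, as in the sample shown.

```shell
# Hypothetical helper: list the distinct values of column 1.
# Assumption: exactly four non-data lines precede the records
# (three summary lines plus the "SNP, SampleID, ..." header).
unique_snps() {
    # skip the first four lines, keep field 1, then sort and de-duplicate
    tail -n +5 "$1" | cut -d',' -f1 | sort -u
}
```

Since `sort -u` sorts its input, the result comes back in sorted order rather than in file order.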
Hi,
I need to find duplicate records in the first column:
ANU4501710430989 0000000W20389390
ANU4501710430989 0000000W67065483
ANU4501130050520 0000000W80838713
ANU4501210170685 0000000W69246611... (3 Replies)
Hi Unix gurus,
Maybe it is too much to ask for, but please take a moment and help me out. A very humble request to you gurus. I'm new to Unix and have just started learning it. I have this project which is way too advanced for me.
File format: CSV file
File has four columns with no header... (8 Replies)
Hi,
I want to remove all records that share a column 1 value, including the first occurrence. For example:
inputfile(filer.txt):
-------------
1,3000,5000
1,4000,6000
2,4000,600
2,5000,700
3,60000,4000
4,7000,7777
5,999,8888
expected output:
----------------
3,60000,4000
4,7000,7777... (5 Replies)
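A two-pass awk approach is one way to drop every row whose column 1 value occurs more than once, first occurrence included. This is only a sketch: the function name `drop_all_duplicates` is hypothetical, and it assumes comma-separated input stored in a regular, non-empty file, because the file is read twice.

```shell
# Hypothetical helper: keep only rows whose column 1 value occurs exactly once.
# Assumption: comma-separated input in a regular, non-empty file (read twice).
drop_all_duplicates() {
    awk -F',' '
        NR == FNR { count[$1]++; next }   # first pass: count each column 1 value
        count[$1] == 1                    # second pass: print rows whose value was counted once
    ' "$1" "$1"
}
```

The first pass only counts; nothing is printed until the second read of the same file, which is why the file name is passed to awk twice.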
I was reading this thread. It looks like a simpler way to say this is to only keep uniq lines based on field or column 1.
https://www.unix.com/shell-programming-scripting/165717-removing-duplicate-records-file-based-single-column.html
Can someone explain this command please? How are there no... (5 Replies)
I have a .CSV file with the below format:
"column 1","column 2","column 3","column 4","column 5","column 6","column 7","column 8","column 9","column 10"
"12310","42324564756","a simple string with a , comma","string with or, without commas","string 1","USD","12","70%","08/01/2013",""... (2 Replies)
Hi,
I have to output a new csv file from an input csv file, keeping the first row for each unique value in the first column.
input csv file
color product id status
green 102 pass
yellow 201 hold
yellow 202 keep
green 101 ok
green 103 hold
yellow 203 ... (5 Replies)
cat sample.csv
ID,Name,no
1,AAA,1
2,BBB,1
3,AAA,1
4,BBB,1
cut -d',' -f2 sample.csv | sort | uniq
This gives only the 2nd column values:
Name
AAA
BBB
How do I get all the columns of the CSV along with this? (1 Reply)
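One common idiom for keeping whole rows while de-duplicating on a single field is awk's `!seen[key]++` pattern. The sketch below is an assumption about what is wanted here, namely the first complete row for each distinct value of column 2. The function name `first_row_per_name` is hypothetical.

```shell
# Hypothetical helper: print the first full row for each distinct column 2 value.
# Assumption: comma-separated input with no quoted fields.
first_row_per_name() {
    # seen[$2]++ is 0 (false) the first time a value appears, so !seen[$2]++
    # is true only for the first row carrying that value; awk prints that row
    awk -F',' '!seen[$2]++' "$1"
}
```

Unlike the `cut | sort | uniq` pipeline, this keeps the original row order and all columns. The header line is printed too, because `Name` is itself the first occurrence of its value.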
Hello,
I have a script that is generating a tab delimited output file.
num Name PCA_A1 PCA_A2 PCA_A3
0 compound_00 -3.5054 -1.1207 -2.4372
1 compound_01 -2.2641 0.4287 -1.6120
3 compound_03 -1.3053 1.8495 ... (3 Replies)
I have a csv file with 30 to 40 columns.
I am pasting just three columns to describe the problem.
I want to filter records: if column 1 matches CN or DN, then
check the values in column 2; if the column contains 1235, 1235, then the values in column 3 must be the sequence 2345, 2345,
and if column 2 contains 6789, 6789... (5 Replies)
Discussion started by: as7951
psc
PSC(1) General Commands Manual PSC(1)
NAME
psc - prepare sc files
SYNOPSIS
psc [-fLkrSPv] [-s cell] [-R n] [-C n] [-n n] [-d c]
DESCRIPTION
Psc is used to prepare data for input to the spreadsheet calculator sc(1). It accepts normal ascii data on standard input. Standard
output is a sc file. With no options, psc starts the spreadsheet in cell A0. Strings are right justified. All data on a line is entered on
the same row; new input lines cause the output row number to increment by one. The default delimiters are tab and space. The column
formats are set to one larger than the number of columns required to hold the largest value in the column.
OPTIONS
-f Omit column width calculations. This option is for preparing data to be merged with an existing spreadsheet. If the option is not
specified, the column widths calculated for the data read by psc will override those already set in the existing spreadsheet.
-L Left justify strings.
-k Keep all delimiters. This option causes the output cell to change on each new delimiter encountered in the input stream. The
default action is to condense multiple delimiters to one, so that the cell only changes once per input data item.
-r Output the data by row first then column. For input consisting of a single column, this option will result in output of one row
with multiple columns instead of a single column spreadsheet.
-s cell
Start the top left corner of the spreadsheet in cell. For example, -s B33 will arrange the output data so that the spreadsheet
starts in column B, row 33.
-R n Increment by n on each new output row.
-C n Increment by n on each new output column.
-n n Output n rows before advancing to the next column. This option is used when the input is arranged in a single column and the
spreadsheet is to have multiple columns, each of which is to be length n.
-d c Use the single character c as the delimiter between input fields.
-P Plain numbers only. A field is a number only when there is no imbedded [-+eE].
-S All numbers are strings.
-v Print the version of psc.
SEE ALSO
sc(1)
AUTHOR
Robert Bond
PSC 7.16 19 September 2002 PSC(1)