04-25-2006
File operations
Hi
I have a tab-delimited file with 3 fields. I need to sort this file on the first field and remove all the records whose first field has duplicates. For example, my file is:
133|arrfdfdg|sdfdsg
234|asfsdgfs|aasdfs
133|affbfsde|dgfg
After the file is sorted, I need the result to be:
234|asfsdgfs|aasdfs
So if the first column has duplicate entries, all of those records should be removed, not just the extra copies. How can I do this in Unix? I am able to sort the file and keep a single record per unique first field using
sort -u -k 1,1 filename
but this is not what I am looking for, because it still keeps one record for each duplicated key. Any help will be appreciated!
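One possible approach, as a minimal sketch rather than a definitive answer: read the file twice with awk, counting each first-field value on the first pass and printing only the lines whose value was seen exactly once on the second pass, then sort the survivors. It assumes the field separator is `|` as in the sample data (not a tab as stated in the description), and the function name `drop_dup_keys` and the file name `sample.txt` are made up for illustration.

```shell
# drop_dup_keys FILE (hypothetical helper name):
# print only the records whose first field occurs exactly once in FILE,
# sorted on that field. Assumes '|' as the field separator.
drop_dup_keys() {
  # Pass 1 (NR == FNR): count occurrences of each first field.
  # Pass 2: print lines whose first field was counted exactly once.
  awk -F'|' 'NR == FNR { count[$1]++; next } count[$1] == 1' "$1" "$1" |
    sort -t'|' -k1,1
}

# Sample data from the question, written to a scratch file.
printf '%s\n' '133|arrfdfdg|sdfdsg' '234|asfsdgfs|aasdfs' '133|affbfsde|dgfg' > sample.txt

drop_dup_keys sample.txt
# prints: 234|asfsdgfs|aasdfs
```

If the real file is tab-delimited, the same idea should work with `-F'\t'` for awk and `-t "$(printf '\t')"` for sort. The file is passed to awk twice because the counts must be complete before any line is printed, so this form cannot read from a pipe.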