Sponsored Content
Top Forums Shell Programming and Scripting Find unique lines based off of bytes Post 302852307 by jl487 on Wednesday 11th of September 2013 07:41:31 AM
Old 09-11-2013
Find unique lines based off of bytes

Hello All,
I have two VERY large .csv files that I want to compare values based on substrings. If the lines are unique, then print the line.

For example, if I run a

Code:
diff file1.csv and file2.csv

I get results similar to
Code:
+_id34,brown,car,2006
+_id1,blue,train,1985
+_id73,white,speed_boat,1990
-_id34,brown,car,2006
-_id72,white,plane,2010
-_id73,white,speed_boat,1990

I want to compare the ids (string between "_" and ",") and if it's unique, then print the line so my output would be like the following:

Output:
Code:
+_id1,blue,train,1985
-_id72,white,plane,2010

I was thinking I could sue the cut command and delimit on the first "_" but didn't know how to compare all the values up until you reach the first comma.

Any suggestions?

Last edited by Scott; 09-13-2013 at 12:39 PM.. Reason: Code tags for input and output too
 

10 More Discussions You Might Find Interesting

1. Shell Programming and Scripting

awk : extracting unique lines based on columns

Hi, snp.txt CHR_A SNP_A BP_A_st BP_A_End CHR_B BP_B SNP_B R2 p-SNP_A p-SNP_B 5 rs1988728 74904317 74904318 5 74960646 rs1427924 0.377333 0.000740085 0.013930081 5 ... (12 Replies)
Discussion started by: genehunter
12 Replies

2. Shell Programming and Scripting

parsing file based on characters/bytes

I have a datafile that is formatted as fixed. I know that each line should contain 880 characters. I want to separate the file into 2 files, one that has lines with 880 characters and the other file with everything else. Is this possible ? (9 Replies)
Discussion started by: cheeko111
9 Replies

3. UNIX for Advanced & Expert Users

In a huge file, Delete duplicate lines leaving unique lines

Hi All, I have a very huge file (4GB) which has duplicate lines. I want to delete duplicate lines leaving unique lines. Sort, uniq, awk '!x++' are not working as its running out of buffer space. I dont know if this works : I want to read each line of the File in a For Loop, and want to... (16 Replies)
Discussion started by: krishnix
16 Replies

4. Shell Programming and Scripting

compare 2 files and return unique lines in each file (based on condition)

hi my problem is little complicated one. i have 2 files which appear like this file 1 abbsss:aa:22:34:as akl abc 1234 mkilll:as:ss:23:qs asc abc 0987 mlopii:cd:wq:24:as asd abc 7866 file2 lkoaa:as:24:32:sa alk abc 3245 lkmo:as:34:43:qs qsa abc 0987 kloia:ds:45:56:sa acq abc 7805 i... (5 Replies)
Discussion started by: anurupa777
5 Replies

5. Shell Programming and Scripting

Find and count unique date values in a file based on position

Hello, I need some sort of way to extract every date contained in a file, and count how many of those dates there are. Here are the specifics: The date format I'm looking for is mm/dd/yyyy I only need to look after line 45 in the file (that's where the data begins) The columns of... (2 Replies)
Discussion started by: ronan1219
2 Replies

6. UNIX for Dummies Questions & Answers

X bytes of 0, Y bytes of random data, Z bytes of 5, T bytes of 1. ??

Hello guys. I really hope someone will help me with this one.. So, I have to write this script who: - creates a file home/student/vmdisk of 10 mb - formats that file to ext3 - mounts that partition to /mnt/partition - creates a file /mnt/partition/data. In this file, there will... (1 Reply)
Discussion started by: razolo13
1 Replies

7. Shell Programming and Scripting

Transpose lines from individual blocks to unique lines

Hello to all, happy new year 2013! May somebody could help me, is about a very similar problem to the problem I've posted here where the member rdrtx1 and bipinajith helped me a lot. https://www.unix.com/shell-programming-scripting/211147-map-values-blocks-single-line-2.html It is very... (3 Replies)
Discussion started by: Ophiuchus
3 Replies

8. Shell Programming and Scripting

Based on column in file1, find match in file2 and print matching lines

file1: file2: I need to find matches for any lines in file1 that appear in file2. Desired output is '>' plus the file1 term, followed by the line after the match in file2 (so the title is a little misleading): This is honestly beyond what I can do without spending the whole night on it, so I'm... (2 Replies)
Discussion started by: pathunkathunk
2 Replies

9. UNIX for Dummies Questions & Answers

Print unique lines without sort or unique

I would like to print unique lines without sort or unique. Unfortunately the server I am working on does not have sort or unique. I have not been able to contact the administrator of the server to ask him to add it for several weeks. (7 Replies)
Discussion started by: cokedude
7 Replies

10. UNIX for Beginners Questions & Answers

Print lines based upon unique values in Nth field

For some reason I am having difficulty performing what should be a fairly easy task. I would like to print lines of a file that have a unique value in the first field. For example, I have a large data-set with the following excerpt: PS003,001 MZMWR/ L-DWD// * PS003,001... (4 Replies)
Discussion started by: jvoot
4 Replies
IO::Async::Protocol::LineStream(3pm)			User Contributed Perl Documentation		      IO::Async::Protocol::LineStream(3pm)

NAME
"IO::Async::Protocol::LineStream" - stream-based protocols using lines of text SYNOPSIS
Most likely this class will be subclassed to implement a particular network protocol. package Net::Async::HelloWorld; use strict; use warnings; use base qw( IO::Async::Protocol::LineStream ); sub on_read_line { my $self = shift; my ( $line ) = @_; if( $line =~ m/^HELLO (.*)/ ) { my $name = $1; $self->invoke_event( on_hello => $name ); } } sub send_hello { my $self = shift; my ( $name ) = @_; $self->write_line( "HELLO $name" ); } This small example elides such details as error handling, which a real protocol implementation would be likely to contain. DESCRIPTION
EVENTS
The following events are invoked, either using subclass methods or CODE references in parameters: on_read_line $line Invoked when a new complete line of input is received. PARAMETERS
The following named parameters may be passed to "new" or "configure": on_read_line => CODE CODE reference for the "on_read_line" event. METHODS
$lineprotocol->write_line( $text ) Writes a line of text to the transport stream. The text will have the end-of-line marker appended to it; $text should not end with it. AUTHOR
Paul Evans <leonerd@leonerd.org.uk> perl v5.14.2 2012-10-24 IO::Async::Protocol::LineStream(3pm)
All times are GMT -4. The time now is 06:09 AM.
Unix & Linux Forums Content Copyright 1993-2022. All Rights Reserved.
Privacy Policy