Sponsored Content
Top Forums Shell Programming and Scripting Find unique lines based off of bytes Post 302852307 by jl487 on Wednesday 11th of September 2013 07:41:31 AM
Old 09-11-2013
Find unique lines based off of bytes

Hello All,
I have two VERY large .csv files that I want to compare values based on substrings. If the lines are unique, then print the line.

For example, if I run a

Code:
diff file1.csv and file2.csv

I get results similar to
Code:
+_id34,brown,car,2006
+_id1,blue,train,1985
+_id73,white,speed_boat,1990
-_id34,brown,car,2006
-_id72,white,plane,2010
-_id73,white,speed_boat,1990

I want to compare the ids (string between "_" and ",") and if it's unique, then print the line so my output would be like the following:

Output:
Code:
+_id1,blue,train,1985
-_id72,white,plane,2010

I was thinking I could sue the cut command and delimit on the first "_" but didn't know how to compare all the values up until you reach the first comma.

Any suggestions?

Last edited by Scott; 09-13-2013 at 12:39 PM.. Reason: Code tags for input and output too
 

10 More Discussions You Might Find Interesting

1. Shell Programming and Scripting

awk : extracting unique lines based on columns

Hi, snp.txt CHR_A SNP_A BP_A_st BP_A_End CHR_B BP_B SNP_B R2 p-SNP_A p-SNP_B 5 rs1988728 74904317 74904318 5 74960646 rs1427924 0.377333 0.000740085 0.013930081 5 ... (12 Replies)
Discussion started by: genehunter
12 Replies

2. Shell Programming and Scripting

parsing file based on characters/bytes

I have a datafile that is formatted as fixed. I know that each line should contain 880 characters. I want to separate the file into 2 files, one that has lines with 880 characters and the other file with everything else. Is this possible ? (9 Replies)
Discussion started by: cheeko111
9 Replies

3. UNIX for Advanced & Expert Users

In a huge file, Delete duplicate lines leaving unique lines

Hi All, I have a very huge file (4GB) which has duplicate lines. I want to delete duplicate lines leaving unique lines. Sort, uniq, awk '!x++' are not working as its running out of buffer space. I dont know if this works : I want to read each line of the File in a For Loop, and want to... (16 Replies)
Discussion started by: krishnix
16 Replies

4. Shell Programming and Scripting

compare 2 files and return unique lines in each file (based on condition)

hi my problem is little complicated one. i have 2 files which appear like this file 1 abbsss:aa:22:34:as akl abc 1234 mkilll:as:ss:23:qs asc abc 0987 mlopii:cd:wq:24:as asd abc 7866 file2 lkoaa:as:24:32:sa alk abc 3245 lkmo:as:34:43:qs qsa abc 0987 kloia:ds:45:56:sa acq abc 7805 i... (5 Replies)
Discussion started by: anurupa777
5 Replies

5. Shell Programming and Scripting

Find and count unique date values in a file based on position

Hello, I need some sort of way to extract every date contained in a file, and count how many of those dates there are. Here are the specifics: The date format I'm looking for is mm/dd/yyyy I only need to look after line 45 in the file (that's where the data begins) The columns of... (2 Replies)
Discussion started by: ronan1219
2 Replies

6. UNIX for Dummies Questions & Answers

X bytes of 0, Y bytes of random data, Z bytes of 5, T bytes of 1. ??

Hello guys. I really hope someone will help me with this one.. So, I have to write this script who: - creates a file home/student/vmdisk of 10 mb - formats that file to ext3 - mounts that partition to /mnt/partition - creates a file /mnt/partition/data. In this file, there will... (1 Reply)
Discussion started by: razolo13
1 Replies

7. Shell Programming and Scripting

Transpose lines from individual blocks to unique lines

Hello to all, happy new year 2013! May somebody could help me, is about a very similar problem to the problem I've posted here where the member rdrtx1 and bipinajith helped me a lot. https://www.unix.com/shell-programming-scripting/211147-map-values-blocks-single-line-2.html It is very... (3 Replies)
Discussion started by: Ophiuchus
3 Replies

8. Shell Programming and Scripting

Based on column in file1, find match in file2 and print matching lines

file1: file2: I need to find matches for any lines in file1 that appear in file2. Desired output is '>' plus the file1 term, followed by the line after the match in file2 (so the title is a little misleading): This is honestly beyond what I can do without spending the whole night on it, so I'm... (2 Replies)
Discussion started by: pathunkathunk
2 Replies

9. UNIX for Dummies Questions & Answers

Print unique lines without sort or unique

I would like to print unique lines without sort or unique. Unfortunately the server I am working on does not have sort or unique. I have not been able to contact the administrator of the server to ask him to add it for several weeks. (7 Replies)
Discussion started by: cokedude
7 Replies

10. UNIX for Beginners Questions & Answers

Print lines based upon unique values in Nth field

For some reason I am having difficulty performing what should be a fairly easy task. I would like to print lines of a file that have a unique value in the first field. For example, I have a large data-set with the following excerpt: PS003,001 MZMWR/ L-DWD// * PS003,001... (4 Replies)
Discussion started by: jvoot
4 Replies
IRSEND(1)								FSF								 IRSEND(1)

NAME
irsend - basic LIRC program to send infra-red commands SYNOPSIS
irsend [options] DIRECTIVE REMOTE CODE [CODE...] DESCRIPTION
Asks the lircd daemon to send one or more CIR (Consumer Infra-Red) commands. This is intended for remote control of electronic devices such as TV boxes, HiFi sets, etc. DIRECTIVE can be: SEND_ONCE - send CODE [CODE ...] once SEND_START - start repeating CODE SEND_STOP - stop repeating CODE LIST - list configured remote items SET_TRANSMITTERS - set transmitters NUM [NUM ...] SIMULATE - simulate IR event REMOTE is the name of a remote, as described in the lircd configuration file. CODE is the name of a remote control key of REMOTE, as it appears in the lircd configuration file. NUM is the transmitter number of the hardware device. For the LIST DIRECTIVE, REMOTE and/or CODE can be empty: LIST "" "" - list all configured remote names LIST REMOTE "" - list all codes of REMOTE LIST REMOTE CODE - list only CODE of REMOTE The SIMULATE command only works if it has been explicitly enabled in lircd. -h --help display usage summary -v --version display version -d --device use given lircd socket [/var/run/lirc/lircd] -a --address=host[:port] connect to lircd at this address -# --count=n send command n times EXAMPLES
irsend LIST DenonTuner "" irsend SEND_ONCE DenonTuner PROG-SCAN irsend SEND_ONCE OnkyoAmpli VOL-UP VOL-UP VOL-UP VOL-UP irsend SEND_START OnkyoAmpli VOL-DOWN ; sleep 3 irsend SEND_STOP OnkyoAmpli VOL-DOWN irsend SET_TRANSMITTERS 1 irsend SET_TRANSMITTERS 1 3 4 irsend SIMULATE "0000000000000476 00 OK TECHNISAT_ST3004S" FILES
/etc/lirc/lircd.conf Default lircd configuration file. It should contain all the remotes, their infra-red codes and the corresponding timing and wave- form details. DIAGNOSTICS
If lircd is not running (or /var/run/lirc/lircd lacks write permissions) irsend aborts with the following diagnostics: "irsend: could not connect to socket" "irsend: Connection refused" (or "Permission denied"). SEE ALSO
The documentation for lirc is maintained as html pages. They are located under html/ in the documentation directory. lircd(8), mode2(1), smode2(1), xmode2(1), irrecord(1), irw(1), http://www.lirc.org. irsend 0.8.7pre1 May 2010 IRSEND(1)
All times are GMT -4. The time now is 02:01 PM.
Unix & Linux Forums Content Copyright 1993-2022. All Rights Reserved.
Privacy Policy