Find unique lines based off of bytes
Post 302853209 by apmcd47 on Friday 13th of September 2013 11:46:28 AM
This appears to work for the example text:
Code:
fgrep -v -f<(sed 's/^._\(id[0-9][0-9]*\).*$/\1/' < comp.txt  | sort | uniq -d) comp.txt

So
Code:
fgrep -v -f file input

lists the lines of input that do not contain any line of file (each line of file is treated as a fixed substring, not a regular expression)
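To see the filtering step on its own, here is a minimal sketch. The file names and sample lines are made up for illustration, and it uses grep -F, which should behave the same as fgrep:

```shell
# Hypothetical sample data; the file names are made up for this sketch.
tmpdir=$(mktemp -d)
printf '%s\n' 'id01' > "$tmpdir/patterns"
printf '%s\n' 'a_id01 first' 'b_id02 second' 'c_id01 third' > "$tmpdir/input"

# -F: fixed strings, -v: invert the match, -f: read the patterns from a file.
# Only lines that contain none of the patterns are kept.
kept=$(grep -F -v -f "$tmpdir/patterns" "$tmpdir/input")
printf '%s\n' "$kept"

rm -rf "$tmpdir"
```

One thing to watch for: because the match is on substrings, a pattern such as id1 would also throw away lines containing id10.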
Code:
<(...)

is a process substitution (a feature of bash, ksh93 and zsh), which lets the output of another command
be used where a file name is expected
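As a rough illustration, the two variants below should give the same result, one with process substitution and one with an explicit temporary file. The sample lines and file names are hypothetical, and the script assumes bash:

```shell
#!/usr/bin/env bash
# Process substitution needs bash, ksh93 or zsh; plain POSIX sh lacks it.
tmpdir=$(mktemp -d)
printf '%s\n' 'a_id01 first' 'b_id02 second' > "$tmpdir/input"

# Variant 1: the pattern list comes straight from a command.
with_procsub=$(grep -F -v -f <(printf '%s\n' 'id01') "$tmpdir/input")

# Variant 2: the same thing with an explicit temporary file.
printf '%s\n' 'id01' > "$tmpdir/patterns"
with_tmpfile=$(grep -F -v -f "$tmpdir/patterns" "$tmpdir/input")

printf '%s\n' "$with_procsub"
rm -rf "$tmpdir"
```

The process substitution form saves creating and cleaning up the intermediate file, which is why it suits a one-liner.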
Code:
sed 's/^._\(id[0-9][0-9]*\).*$/\1/'

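gives us the idNN key from each input line (the capture group keeps only the id and its digits, so the leading underscore is dropped)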
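A small sketch of what the sed expression does, using made-up input lines (one character, an underscore, then the id), since the real comp.txt is not shown here:

```shell
# Hypothetical input lines: one character, an underscore, then idNN.
keys=$(printf '%s\n' 'a_id01 first' 'b_id02 second' 'c_id01 third' |
    sed 's/^._\(id[0-9][0-9]*\).*$/\1/')
printf '%s\n' "$keys"

# A line that does not match the pattern is passed through unchanged.
unmatched=$(printf '%s\n' 'no key here' | sed 's/^._\(id[0-9][0-9]*\).*$/\1/')
printf '%s\n' "$unmatched"
```

The second command is worth keeping in mind: any line without a key in the expected position ends up in the pattern list as-is.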
Code:
uniq -d

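lists only the duplicated lines, one copy of each (unique lines are thrown away). The sort before it is needed because uniq only compares adjacent lines.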
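The following sketch shows why the keys go through sort before uniq -d. The keys are hypothetical, in the shape the sed step might produce:

```shell
# Hypothetical keys; the two id01 entries are not adjacent.
unsorted=$(printf '%s\n' id01 id02 id01 | uniq -d)
sorted=$(printf '%s\n' id01 id02 id01 | sort | uniq -d)

printf 'without sort: [%s]\n' "$unsorted"
printf 'with sort:    [%s]\n' "$sorted"
```

Without the sort, uniq never sees the two id01 lines next to each other and reports nothing. With it, id01 is reported once.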

Does this work for you?

Andrew
 
