07-16-2004
Huge (repeated Entry) text files
Somebody HELP!
I have a huge log file (plain text), 76298035 bytes.
It's a logfile of IMEIs and IMSIs that I get from my EIR node.
Here is what the contents of the file look like:
000000,
1 33016382000913 652020100423994
1 33016382002353 652020100430743
1 33017035101003 652020100441736
....
....
....
235800,
1 35725620987678 652020100545862
The problem is that the file is made huge, to some degree, by repeated entries (repeated lines, not necessarily consecutive).
I have tried this code to eliminate the repeated entries:
cat myfile | sed -n 'G; s/\n/&&/; /^\([ -~]*\n\).*\n\1/d; s/\n//; h; P' | tee mynewfile | wc -l
but it takes forever and stops midway, at 024000 instead of 235800.
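One likely reason the sed approach is so slow: it appends every unique line to the hold space and re-scans that whole buffer with a regex for each new input line, so the work grows roughly quadratically with the number of distinct lines. Why it stops at 024000 is not clear from the post. A commonly used alternative is an awk associative array keyed on the full line. The sketch below assumes that all distinct lines fit in memory (reasonable for a file of about 76 MB) and that keeping the first occurrence of each line is what is wanted. The sample data is a small made-up excerpt built from the values shown above; `myfile` and `mynewfile` are the names from the original command.

```shell
# Small sample in the same layout as the log (values taken from the post,
# with two records deliberately repeated).
cat > myfile <<'EOF'
000000,
1 33016382000913 652020100423994
1 33016382002353 652020100430743
1 33016382000913 652020100423994
235800,
1 33016382002353 652020100430743
1 35725620987678 652020100545862
EOF

# Keep only the first occurrence of every line, preserving the original order.
# seen[] is an awk associative array keyed on the whole line ($0); the
# expression is true (so the line is printed) only the first time a line is seen.
awk '!seen[$0]++' myfile > mynewfile

# Count the lines that remain.
wc -l < mynewfile
```

On this sample, five lines remain: both timestamp lines and the three distinct records. If the original order does not matter, `sort -u myfile > mynewfile` is another option, but it would also move the timestamp lines out of place.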