Full Discussion: Remove dupes in a large file
Post 303024652 by MadeInGermany on Sunday 14th of October 2018 04:45:14 AM
Exactly. X[$0]++ stores a numeric value for each key, i.e. every new (distinct) line additionally consumes the space for one number.
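For illustration, here is a minimal sketch of the counter idiom next to a membership-only variant. The file name input.txt, its sample data and the second one-liner are assumptions made for this example and are not taken from the post. Whether the second form actually saves memory depends on the awk implementation.

```shell
# Create a small sample file (hypothetical data).
printf '%s\n' a b a c b > input.txt

# Counter idiom: X[$0] holds a number that is incremented on every occurrence;
# a line is printed only while its counter is still zero.
awk '!X[$0]++' input.txt

# Membership idiom: referencing X[$0] creates the key without assigning a count;
# a line is printed only if its key does not exist yet.
awk '!($0 in X) { X[$0]; print }' input.txt
```

Both commands print each distinct line once, in order of first appearance.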

10 More Discussions You Might Find Interesting

1. Shell Programming and Scripting

remove a large number of user from oracle

Hi, on Solaris and Oracle 10g2, I have a number of users created in Oracle. I wonder, if I have a list of the usernames, will it be possible to remove the users quickly? I want to keep the users' access to the system but not to Oracle. Something like a shell script, maybe? I am trying to... (4 Replies)
Discussion started by: upengan78
4 Replies

2. Shell Programming and Scripting

Sed or awk script to remove text / or perform calculations from large CSV files

I have large CSV files (e.g. 2 million records) and am hoping to do one of two things. I have been trying to use awk and sed but am a newbie and can't figure out how to get it to work. Any help you could offer would be greatly appreciated - I'm stuck trying to remove the colon and wildcards in... (6 Replies)
Discussion started by: metronomadic
6 Replies

3. Shell Programming and Scripting

remove a specific line in a LARGE file

Hi guys, I have a really big file, and I want to remove a specific line. sed -i '5d' file This doesn't really work, it takes a lot of time... The whole script is supposed to remove every word containing less than 5 characters and currently looks like this: #!/bin/bash line="1"... (2 Replies)
Discussion started by: blubbiblubbkekz
2 Replies

4. Shell Programming and Scripting

Remove Duplicate Filenames in 2 very large directories

Hello Gurus, O/S RHEL4 I have a requirement to compare two linux based directories for duplicate filenames and remove them. These directories are close to 2 TB each. I have tried running a: Prompt>diff -r data1/ data2/ I have tried this as well: jason@jason-desktop:~$ cat script.sh ... (7 Replies)
Discussion started by: jaysunn
7 Replies

5. Shell Programming and Scripting

How to remove a subset of data from a large dataset based on values on one line

Hello. I was wondering if anyone could help. I have a file containing a large table in the format: marker1 marker2 marker3 marker4 position1 position2 position3 position4 genotype1 genotype2 genotype3 genotype4 with marker being a name, position a numeric... (2 Replies)
Discussion started by: davegen
2 Replies

6. UNIX for Dummies Questions & Answers

Filtering F-Dupes

Is there an easy way to tell FDupes what filetypes to look at or ignore? (0 Replies)
Discussion started by: furashgf
0 Replies

7. Shell Programming and Scripting

Removing Dupes from huge file- awk/perl/uniq

Hi, I have the following command in place nawk -F, '!a++' file > file.uniq It has been working perfectly as per requirements, by removing duplicates by taking into consideration only first 3 fields. Recently it has started giving below error: bash-3.2$ nawk -F, '!a++'... (17 Replies)
Discussion started by: makn
17 Replies

8. Shell Programming and Scripting

remove large portion of web page code between two tags

Hi everybody, I am trying to remove bunch of lines from web pages between two tags: one is <h1> and the other is <table it looks like <h1>Anniversary cards roses</h1> many lines here <table summary="Free anniversary greeting cards." cellspacing="8" cellpadding="8" width="70%">my goal... (5 Replies)
Discussion started by: georgi58
5 Replies

9. Shell Programming and Scripting

Removing dupes within 2 delimited areas in a large dictionary file

Hello, I have a very large dictionary file which is in text format and which contains a large number of sub-sections. Each sub-section starts with the following header : #DATA #VALID 1 and ends with a footer as shown below #END The data between the Header and the Footer consists of... (6 Replies)
Discussion started by: gimley
6 Replies

10. Shell Programming and Scripting

Modify script to remove dupes with two delimiters

Hello, I have a script which removes duplicates in a database with a single delimiter = The script is given below: # script to remove dupes from a row with structure word=word BEGIN{FS="="} {for(i=1;i<=NF;i++){a[$i]++;}for(i in a){b=b"="i}{sub("=","",b);$0=b;b="";delete a}}1 How do I modify... (6 Replies)
Discussion started by: gimley
6 Replies
uuencode(5)							File Formats Manual						       uuencode(5)

Name
       uuencode - format of an encoded uuencode file

Description
       Files output by uuencode consist of a header line, followed by a number of body lines, and a trailer line.  The uudecode command ignores any
       lines preceding the header or following the trailer.  Lines preceding a header must not, of course, look like a header.

       The header line is distinguished by its first six characters: the word ``begin'' followed by a space.  The next items on the line
       are a mode (in octal) and a string which names the remote file.  A space separates the three items in the header line.
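As a rough sketch of the header rule, the shell function below checks the six-character prefix and splits the rest of the line. The function name parse_header and the sample header are hypothetical and only serve this example.

```shell
# parse_header LINE: print the mode and file name from a uuencode header line.
parse_header() {
    case $1 in
        'begin '*) ;;                          # first six characters must be "begin "
        *) echo 'not a header' >&2; return 1 ;;
    esac
    # Split on spaces: keyword, octal mode, then the remainder as the file name.
    read -r keyword mode name <<EOF
$1
EOF
    echo "mode=$mode name=$name"
}

parse_header 'begin 644 cat.txt'   # prints: mode=644 name=cat.txt
```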

       The body consists of a number of lines, each at most 62 characters long including the trailing new line.  These consist of a character
       count, followed by encoded characters, followed by a new line.  The character count is a single printing character and represents an
       integer, the number of bytes the rest of the line represents.  Such integers are always in the range from 0 to 63 and can be determined by
       subtracting the character space (octal 40) from the character.
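The count rule can be tried out with a small shell sketch. The helper name count_of is an assumption made for this example; it relies only on the POSIX printf convention that a leading quote yields a character's numeric code.

```shell
# count_of CHAR: print the byte count represented by a uuencode count character.
count_of() {
    # A leading single quote makes printf output the character's numeric code.
    code=$(printf '%d' "'$1")
    echo $(( code - 32 ))      # 32 decimal == octal 40 == ASCII space
}

count_of M     # prints 45, the count on a full-length line
count_of '#'   # prints 3
```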

       Groups of 3 bytes are stored in 4 characters, with 6 bits per character.  All are offset by a space to make the characters print.  The last
       line may be shorter than the normal 45 bytes.  If the size is not a multiple of 3, this fact can be determined by the value of the count on
       the last line.  Extra dummy characters are included to make the character count a multiple of 4.  The body is terminated by a line  with  a
       count of zero.  This line consists of one ASCII space.
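The 3-bytes-into-4-characters step might look like the following in POSIX shell. This is a sketch under stated assumptions, not the actual uuencode source: the helpers ord, chr and encode3 are hypothetical names, and the input is assumed to be three single ASCII characters.

```shell
# ord CHAR: numeric code of a single ASCII character.
ord() { printf '%d' "'$1"; }

# chr N: the ASCII character with decimal code N (via a \0ddd octal escape).
chr() { printf '%b' "\\0$(printf '%o' "$1")"; }

# encode3 A B C: print the 4 characters that represent the 3 given bytes.
encode3() {
    a=$(ord "$1"); b=$(ord "$2"); c=$(ord "$3")
    n=$(( (a << 16) | (b << 8) | c ))          # pack 24 bits into one integer
    for s in 18 12 6 0; do
        chr $(( ((n >> s) & 63) + 32 ))        # take 6 bits, offset by a space
    done
    echo
}

encode3 C a t   # prints 0V%T
```

With the count character for 3 bytes in front, the complete body line for the input "Cat" would be #0V%T.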

       The trailer line consists of "end" on a line by itself.

See Also
       mail(1), uucp(1c), uudecode(1c), uuencode(1c), uusend(1c)
