Top Forums Shell Programming and Scripting extracting unique lines from text file Post 302227942 by soliberus on Friday 22nd of August 2008 09:46:23 AM
08-22-2008
extracting unique lines from text file

I have a file with 14 million lines, and I would like to extract all the unique lines from it into another text file.

For example:

Contents of file1

happy
sad
smile
happy
funny
sad

I want to run a command against file1 that returns each distinct line only once (i.e. one line for happy and one line for sad).

Could someone please point me in the right direction?

Thanks
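
A minimal sketch of two common ways to do this, assuming the input is named file1 as in the example above. The output names unique_sorted.txt and unique_ordered.txt are placeholders, not names from the post. Note that uniq on its own only collapses adjacent duplicates, so the input has to be sorted first, or sort -u used directly.

```shell
# Recreate the sample input from the post (file1 is the name used above).
printf '%s\n' happy sad smile happy funny sad > file1

# Option 1: sort and drop duplicates in one step.
# The output is in sorted order, not the original order.
sort -u file1 > unique_sorted.txt

# Option 2: keep the first occurrence of each line, in original order.
# awk remembers every distinct line in the "seen" array, so memory use
# grows with the number of distinct lines.
awk '!seen[$0]++' file1 > unique_ordered.txt
```

For a 14 million line file, sort -u is usually the safer choice because sort can spill to temporary files on disk, while the awk form needs all distinct lines to fit in memory.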
 

sad(7D)                              Devices                              sad(7D)

NAME
     sad - STREAMS Administrative Driver

SYNOPSIS
     #include <sys/types.h>
     #include <sys/conf.h>
     #include <sys/sad.h>
     #include <sys/stropts.h>

     int ioctl(int fildes, int command, int arg);

DESCRIPTION
     The STREAMS Administrative Driver provides an interface for applications to perform administrative operations on STREAMS modules and drivers. The interface is provided through ioctl(2) commands. Privileged operations may access the sad driver using /dev/sad/admin. Unprivileged operations may access the sad driver using /dev/sad/user.

     The fildes argument is an open file descriptor that refers to the sad driver. The command argument determines the control function to be performed as described below. The arg argument represents additional information that is needed by this command. The type of arg depends upon the command, but it is generally an integer or a pointer to a command-specific data structure.

COMMAND FUNCTIONS
     The autopush facility (see autopush(1M)) allows one to configure a list of modules to be automatically pushed on a stream when a driver is first opened. Autopush is controlled by the following commands:

     SAD_SAP
          Allows the administrator to configure the given device's autopush information. arg points to a strapush structure, which contains the following members:

               uint_t   sap_cmd;
               major_t  sap_major;
               minor_t  sap_minor;
               minor_t  sap_lastminor;
               uint_t   sap_npush;
               uint_t   sap_list [MAXAPUSH] [FMNAMESZ + 1];

          The sap_cmd field indicates the type of configuration being done. It may take on one of the following values:

               SAP_ONE     Configure one minor device of a driver.
               SAP_RANGE   Configure a range of minor devices of a driver.
               SAP_ALL     Configure all minor devices of a driver.
               SAP_CLEAR   Undo configuration information for a driver.

          The sap_major field is the major device number of the device to be configured. The sap_minor field is the minor device number of the device to be configured. The sap_lastminor field is used only with the SAP_RANGE command, which configures a range of minor devices between sap_minor and sap_lastminor, inclusive. The minor fields have no meaning for the SAP_ALL command.

          The sap_npush field indicates the number of modules to be automatically pushed when the device is opened. It must be less than or equal to MAXAPUSH, defined in sad.h. It must also be less than or equal to NSTRPUSH, the maximum number of modules that can be pushed on a stream, defined in the kernel master file. The field sap_list is an array of NULL-terminated module names to be pushed in the order in which they appear in the list.

          When using the SAP_CLEAR command, the user sets only sap_major and sap_minor. This will undo the configuration information for any of the other commands. If a previous entry was configured as SAP_ALL, sap_minor should be set to zero. If a previous entry was configured as SAP_RANGE, sap_minor should be set to the lowest minor device number in the range configured.

          On failure, errno is set to one of the following values:

               EFAULT   arg points outside the allocated address space.
               EINVAL   The major device number is invalid, the number of modules is invalid, or the list of module names is invalid.
               ENOSTR   The major device number does not represent a STREAMS driver.
               EEXIST   The major-minor device pair is already configured.
               ERANGE   The command is SAP_RANGE and sap_lastminor is not greater than sap_minor, or the command is SAP_CLEAR and sap_minor is not equal to the first minor in the range.
               ENODEV   The command is SAP_CLEAR and the device is not configured for autopush.
               ENOSR    An internal autopush data structure cannot be allocated.

     SAD_GAP
          Allows any user to query the sad driver to get the autopush configuration information for a given device. arg points to a strapush structure as described in the previous command.

          The user should set the sap_major and sap_minor fields of the strapush structure to the major and minor device numbers, respectively, of the device in question. On return, the strapush structure will be filled in with the entire information used to configure the device. Unused entries in the module list will be zero-filled.

          On failure, errno is set to one of the following values:

               EFAULT   arg points outside the allocated address space.
               EINVAL   The major device number is invalid.
               ENOSTR   The major device number does not represent a STREAMS driver.
               ENODEV   The device is not configured for autopush.

     SAD_VML
          Allows any user to validate a list of modules (that is, to see if they are installed on the system). arg is a pointer to a str_list structure with the following members:

               int               sl_nmods;
               struct str_mlist  *sl_modlist;

          The str_mlist structure has the following member:

               char  l_name[FMNAMESZ+1];

          sl_nmods indicates the number of entries the user has allocated in the array and sl_modlist points to the array of module names. The return value is 0 if the list is valid, 1 if the list contains an invalid module name, or -1 on failure.

          On failure, errno is set to one of the following values:

               EFAULT   arg points outside the allocated address space.
               EINVAL   The sl_nmods field of the str_list structure is less than or equal to zero.

SEE ALSO
     Intro(2), ioctl(2), open(2)

     STREAMS Programming Guide

DIAGNOSTICS
     Unless otherwise specified, the return value from ioctl() is 0 upon success and -1 upon failure with errno set as indicated.

SunOS 5.11                          16 Apr 1997                          sad(7D)