split a file with unique sets

9 More Discussions You Might Find Interesting

1. UNIX for Advanced & Expert Users

FILE SETS in unix

Hi all, Pls. let me know whether there is any concept called "FILE SETS" in unix? Because, I am using ETL tool DataStage which creates FILE SETS. While I am able to view the data of such a file set in the tool, the "cat" command on this FILESET lists only the Metadata and not the data content...

2. AIX

IP Security file sets

hello, we are implementing ip security on several of our aix 5.2-09 boxes and i am unable to locate the prerequisite file sets. does anyone know where i can find these? i have the original 5.2 cd's but these file sets are not on any of the cd's. Any thoughts or suggestions?

3. Virtualization and Cloud Computing

Clouds (Partially Order Sets) - Streams (Linearly Ordered Sets) - Part 2

timbass Sat, 28 Jul 2007 10:07:53 +0000 Originally posted in Yahoo! CEP-Interest Here is my follow-up note on posets (partially ordered sets) and tosets (totally or linearly ordered sets) as background set theory for event processing, and in particular CEP and ESP. In my last note, we...

4. Shell Programming and Scripting

get part of file with unique & non-unique string

I have an archive file that holds a batch of statements. I would like to be able to extract a certain statement based on the unique customer # (ie. 123456). The end for each statement is noted by "ENDSTM". I can find the line number for the beginning of the statement section with sed. ...

5. Shell Programming and Scripting

sort split merge -u unique

Hi, this is about sorting a very large file (like 10 gb) to keep lines with unique entries across SOME of the columns. The line originally looked like this: sort -u -k2,2 -k3,3n -k4,4n -k5,5n -k6,6n file_unsorted > file_sorted please note the -u flag. The problem is that this single...

6. Shell Programming and Scripting

Change unique file names into new unique filenames

I have 84 files with the following names splitseqs.1, spliseqs.2 etc. and I want to change the .number to a unique filename. E.g. change splitseqs.1 into splitseqs.7114_1#24 and change spliseqs.2 into splitseqs.7067_2#4 So all the current file names are unique, so are the new file names....

7. Shell Programming and Scripting

Identifying dupes within a database and creating unique sub-sets

Hello, I have a database of name variants with the following structure: variant=variant=variant The number of variants can be as many as thirty to forty. Since the database is quite large (at present around 60,000 lines) duplicate sets of variants creep in. Thus John=Johann=Jon and...

8. UNIX for Beginners Questions & Answers

sed awk: split a large file to unique file names

Dear Users, Appreciate your help if you could help me with splitting a large file > 1 million lines with sed or awk. below is the text in the file input file.txt scaffold1 928 929 C/T + scaffold1 942 943 G/C + scaffold1 959 960 C/T +...

9. UNIX for Beginners Questions & Answers

Split into multiple files by using Unique columns in a UNIX file

I have requirement to split below file (sample.csv) into multiple files by using the unique columns (first 3 are unique columns) sample.csv 123|22|56789|ABCDEF|12AB34|2019-07-10|2019-07-10|443.3400|1|1 123|12|5679|BCDEFG|34CD56|2019-07-10|2019-07-10|896.7200|1|2...

LEARN ABOUT FREEBSD

split

SPLIT(1)						    BSD General Commands Manual 						  SPLIT(1)

NAME

     split -- split a file into pieces

SYNOPSIS

     split [-d] [-l line_count] [-a suffix_length] [file [prefix]]
     split [-d] -b byte_count[K|k|M|m|G|g] [-a suffix_length] [file [prefix]]
     split [-d] -n chunk_count [-a suffix_length] [file [prefix]]
     split [-d] -p pattern [-a suffix_length] [file [prefix]]

DESCRIPTION

     The split utility reads the given file and breaks it up into files of 1000 lines each (if no options are specified), leaving the file
     unchanged.  If file is a single dash ('-') or absent, split reads from the standard input.
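
The default behaviour described above can be tried with a small sketch like the following. The file name input.txt, the scratch directory, and the 2500-line size are assumptions chosen for illustration; they do not come from the manual page.

```shell
# Work in a scratch directory so the generated pieces are easy to inspect.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Build a hypothetical 2500-line input file, one number per line.
seq 1 2500 > input.txt

# No options: pieces of 1000 lines each, written as xaa, xab and xac.
# The input file itself is left unchanged.
split input.txt

# A single dash makes split read standard input instead of a named file.
# This runs in a subdirectory so the second run does not overwrite the first.
mkdir from_stdin
(cd from_stdin && split - < ../input.txt)

# Show the line counts of the three pieces.
wc -l xaa xab xac
```

With 2500 input lines, the last piece holds only the 500 lines left over after two full pieces.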

     The options are as follows:

     -a suffix_length
	     Use suffix_length letters to form the suffix of the file name.

     -b byte_count[K|k|M|m|G|g]
	     Create split files byte_count bytes in length.  If k or K is appended to the number, the file is split into byte_count kilobyte
	     pieces.  If m or M is appended to the number, the file is split into byte_count megabyte pieces.  If g or G is appended to the
	     number, the file is split into byte_count gigabyte pieces.

     -d      Use a numeric suffix instead of an alphabetic suffix.

     -l line_count
	     Create split files line_count lines in length.

     -n chunk_count
	     Split file into chunk_count smaller files.

     -p pattern
	     The file is split whenever an input line matches pattern, which is interpreted as an extended regular expression.  The matching line
	     will be the first line of the next output file.  This option is incompatible with the -b and -l options.
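
As a rough illustration of the options listed above, the sketch below exercises -l, -a, -d and -b on throwaway data. The file names input.txt and blob.bin and the prefixes part. and blob. are hypothetical. The -p option is left out because it may not be available in split implementations other than the BSD one documented here.

```shell
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Hypothetical 25-line input file.
seq 1 25 > input.txt

# -l 10: pieces of 10 lines each.
# -a 3:  three-character suffixes, so the pieces are xaaa, xaab and xaac.
split -l 10 -a 3 input.txt

# -d: numeric suffixes instead of alphabetic ones.
# With the prefix "part." the pieces are part.00, part.01 and part.02.
split -d -l 10 input.txt part.

# -b 1k: pieces of 1 kilobyte (1024 bytes); the last piece holds the remainder.
head -c 2500 /dev/zero > blob.bin
split -b 1k blob.bin blob.

ls
```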

     If additional arguments are specified, the first is used as the name of the input file which is to be split.  If a second additional argument
     is specified, it is used as a prefix for the names of the files into which the file is split.  In this case, each file into which the file is
     split is named by the prefix followed by a lexically ordered suffix using suffix_length characters in the range ``a-z''.  If -a is not
     specified, two letters are used as the suffix.

     If the prefix argument is not specified, the file is split into lexically ordered files named with the prefix ``x'' and with suffixes as
     above.
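
One practical consequence of the lexically ordered suffixes is that a shell glob lists the pieces in their original order. The sketch below relies on that to rebuild the input. The prefix chunk_ and the file names input.txt and rebuilt.txt are assumptions made for the example.

```shell
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Hypothetical 2500-line input file.
seq 1 2500 > input.txt

# The second argument is the prefix: the pieces are chunk_aa, chunk_ab, chunk_ac.
split input.txt chunk_

# The glob expands in sorted order, which matches the suffix order,
# so concatenating the pieces reproduces the original file.
cat chunk_* > rebuilt.txt

# cmp is silent and exits 0 when the two files are identical.
cmp input.txt rebuilt.txt && echo "pieces reassemble to the original"
```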

ENVIRONMENT

     The LANG, LC_ALL, LC_CTYPE and LC_COLLATE environment variables affect the execution of split as described in environ(7).

EXIT STATUS

     The split utility exits 0 on success, and >0 if an error occurs.
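
A script can branch on this exit status. The following is only a sketch: the file names, the prefixes ok_ and bad_, and the variable names ok_status and bad_status are hypothetical, and the exact nonzero value is not specified by the manual page, so it is only compared against zero.

```shell
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Hypothetical 30-line input file.
seq 1 30 > input.txt

# Successful run: the status stays 0.
ok_status=0
split -l 10 input.txt ok_ || ok_status=$?

# Failing run: the input file does not exist, so split reports an error.
bad_status=0
split -l 10 no_such_file.txt bad_ 2>/dev/null || bad_status=$?

if [ "$ok_status" -eq 0 ]; then
    echo "first split succeeded"
fi
if [ "$bad_status" -ne 0 ]; then
    echo "second split failed with status $bad_status"
fi
```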

SEE ALSO

     csplit(1), re_format(7)

STANDARDS

     The split utility conforms to IEEE Std 1003.1-2001 (``POSIX.1'').

HISTORY

     A split command appeared in Version 3 AT&T UNIX.

BUGS

     The maximum line length for matching patterns is 65536.

BSD                                                                 May 9, 2013                                                                 BSD