split a file with unique sets
Forum: UNIX for Dummies Questions & Answers. Post 302250091 by ChicagoBlues on Wednesday 22nd of October 2008 05:12:54 PM
I was hoping for a better solution, but here is a crude way that I thought of:

1. Split the file 'n' ways (n=3 for this example):

part 1   part 2   part 3
1        2        3
1        2        4
1        3        5

2. If (size of orig file) % n = 10 % 3 = 1 > 0, then append the remaining ids to the last partition:

part 3
3
4
5
5

3. Compare part 1 with part 2 and check whether any ids match. If they do, move the matching rows from part 2 to part 1. Then move on to the next pair of parts (part 2 and part 3) and do the same:

part 1
1
1
1

part 2
2
2
3
3

part 3
4
5
5

Hopefully, someone will present a sleeker solution with some syntax.
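
In the meantime, here is a rough sketch of the three steps above in Python. It is only an illustration, not a tested production script: it assumes the ids are already sorted and fit in memory as a list, and the function name `split_keep_ids_together` is made up for this example.

```python
def split_keep_ids_together(ids, n):
    """Split a sorted list of ids into n parts so that no id spans two parts."""
    size = len(ids) // n  # rows per part before any adjustment

    # Step 1: cut n parts of equal size.
    parts = [ids[i * size:(i + 1) * size] for i in range(n)]

    # Step 2: append the len(ids) % n leftover rows to the last part.
    parts[-1].extend(ids[n * size:])

    # Step 3: for each part, look at the nearest earlier part that still has
    # rows, and move leading rows back while they carry that part's last id.
    for i in range(1, n):
        prev = next((p for p in reversed(parts[:i]) if p), None)
        while prev and parts[i] and parts[i][0] == prev[-1]:
            prev.append(parts[i].pop(0))

    return parts


if __name__ == "__main__":
    # The 10 ids from the example above.
    for number, part in enumerate(split_keep_ids_together([1, 1, 1, 2, 2, 3, 3, 4, 5, 5], 3), 1):
        print("part", number, part)
    # part 1 [1, 1, 1]
    # part 2 [2, 2, 3, 3]
    # part 3 [4, 5, 5]
```

Note that the parts can end up uneven, or even empty, when one id has many rows. That is the price of keeping every id inside a single part.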

Thanks,

- CB
 
