How to split large file with different record delimiter? Post 302990685 by rdrtx1 on Monday 30th of January 2017 06:00:24 PM
Code:
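awk '
# Pass 1 over datafile: count the records.
# A multi-character (regex) RS needs an awk such as gawk or mawk.
NR==FNR {records=NR; next}
# Pass 2, first record: Split is the number of output files;
# derive the number of records per file from it (rounded up).
FNR==1 {
   per_file=(records % Split) ? (int(records/Split)+1) : (records/Split);
   split_file="split_file1";
}
{
   # print appends ORS, the literal delimiter. Printing RS would write
   # the regex text [|]##[|], and printf $0 misreads any % in the data.
   print > split_file;
   if (! (FNR % per_file)) {
      close(split_file);
      split_file="split_file" (1 + ++file_c);
   }
}
' RS="[|]##[|]" ORS="|##|" datafile Split=4 datafile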
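To try the approach on something small, the following sketch builds a hypothetical sample file and runs a variant of the script on it. The sample contents (r1 to r10) and the per_file variable name are assumptions made for illustration; ORS is set to the literal |##| so that print writes the delimiter back, because RS holds a regex here. It assumes an awk that accepts a regex RS, such as gawk or mawk.

```shell
# Hypothetical sample: 10 records r1..r10, each followed by the literal |##|.
for i in 1 2 3 4 5 6 7 8 9 10; do printf 'r%s|##|' "$i"; done > datafile

awk '
NR==FNR {records=NR; next}            # pass 1: count records
FNR==1 {                              # pass 2: records per output file
   per_file=(records % Split) ? (int(records/Split)+1) : (records/Split)
   split_file="split_file1"
}
{
   print > split_file                 # print appends ORS, i.e. |##|
   if (! (FNR % per_file)) {
      close(split_file)
      split_file="split_file" (1 + ++file_c)
   }
}
' RS="[|]##[|]" ORS="|##|" datafile Split=4 datafile

# List the pieces and check that they reassemble to the original.
ls split_file*
cat split_file1 split_file2 split_file3 split_file4 | cmp - datafile \
   && echo "pieces reassemble to the original"
```

With 10 records and Split=4, each piece holds up to 3 records, so the last piece holds the single remaining record.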

 

split(1)                         User Commands                         split(1)

NAME
     split - split a file into pieces

SYNOPSIS
     split [-linecount | -l linecount] [-a suffixlength] [file [name]]

     split [-b n | nk | nm] [-a suffixlength] [file [name]]

DESCRIPTION
     The split utility reads file and writes it in linecount-line pieces
     into a set of output-files. The name of the first output-file is name
     with aa appended, and so on lexicographically, up to zz (a maximum of
     676 files). The maximum length of name is 2 characters less than the
     maximum filename length allowed by the filesystem. See statvfs(2). If
     no output name is given, x is used as the default (output-files will
     be called xaa, xab, and so forth).

OPTIONS
     The following options are supported:

     -linecount | -l linecount
          Number of lines in each piece. Defaults to 1000 lines.

     -a suffixlength
          Uses suffixlength letters to form the suffix portion of the
          filenames of the split file. If -a is not specified, the default
          suffix length is 2. If the sum of the name operand and the
          suffixlength option-argument would create a filename exceeding
          NAME_MAX bytes, an error will result; split will exit with a
          diagnostic message and no files will be created.

     -b n
          Splits a file into pieces n bytes in size.

     -b nk
          Splits a file into pieces n*1024 bytes in size.

     -b nm
          Splits a file into pieces n*1048576 bytes in size.

OPERANDS
     The following operands are supported:

     file
          The path name of the ordinary file to be split. If no input file
          is given or file is -, the standard input will be used.

     name
          The prefix to be used for each of the files resulting from the
          split operation. If no name argument is given, x will be used as
          the prefix of the output files. The combined length of the
          basename of prefix and suffixlength cannot exceed NAME_MAX bytes.
          See OPTIONS.

USAGE
     See largefile(5) for the description of the behavior of split when
     encountering files greater than or equal to 2 Gbyte (2^31 bytes).

ENVIRONMENT VARIABLES
     See environ(5) for descriptions of the following environment variables
     that affect the execution of split: LANG, LC_ALL, LC_CTYPE,
     LC_MESSAGES, and NLSPATH.

EXIT STATUS
     The following exit values are returned:

     0    Successful completion.

     >0   An error occurred.

ATTRIBUTES
     See attributes(5) for descriptions of the following attributes:

     +-----------------------------+-----------------------------+
     |       ATTRIBUTE TYPE        |       ATTRIBUTE VALUE       |
     +-----------------------------+-----------------------------+
     |Availability                 |SUNWesu                      |
     +-----------------------------+-----------------------------+
     |CSI                          |Enabled                      |
     +-----------------------------+-----------------------------+
     |Interface Stability          |Committed                    |
     +-----------------------------+-----------------------------+
     |Standard                     |See standards(5).            |
     +-----------------------------+-----------------------------+

SEE ALSO
     csplit(1), statvfs(2), attributes(5), environ(5), largefile(5),
     standards(5)

SunOS 5.11                        16 Apr 1999                         split(1)
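
The split(1) utility documented above cuts only on line counts or byte counts, so it does not by itself handle a custom record delimiter such as |##|. For ordinary newline-terminated data, a minimal sketch of the -l, -a and -b options could look like this. The file names lines.txt, part. and chunk. are hypothetical and chosen only for the example.

```shell
# Hypothetical input: 2500 numbered lines.
awk 'BEGIN { for (i = 1; i <= 2500; i++) print "line " i }' > lines.txt

# 1000 lines per piece, three-letter suffixes, output prefix "part."
# This yields part.aaa, part.aab and part.aac.
split -l 1000 -a 3 lines.txt part.
wc -l part.*

# Pieces of 10*1024 bytes each, default two-letter suffixes, prefix "chunk."
split -b 10k lines.txt chunk.
ls chunk.*
```

Concatenating the pieces in suffix order (for example with cat part.* > rebuilt.txt) should reproduce the original file, which is a cheap way to check a split.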