Sponsored Content
Top Forums Shell Programming and Scripting Split a large file in n records and skip a particular record Post 302877390 by Akshay Hegde on Saturday 30th of November 2013 02:12:05 PM
Old 11-30-2013
Quote:
Originally Posted by Chubler_XL
The problem is that all the conditions must be true to change filename so if record 5000 starts with "3" its not tested again on record 5001.

My solution failed because record 1 had "3" so no starting filename was set.
My understanding
Corona's code
Code:
awk 'BEGIN{x="F"++i } NR%5==1{N++} N&&!/^\s*3/{if(x) close(x);x="Fa"++i;N=0}{print > x}'

1.It sets x in beginning
2.when NR becomes 6 remainder will be 1 and N will be incremented
3. if N is set and line doesn't start with 3 is true, it checks whether x is set or not, if x is set close x, increment i x will be the new file, reset N.
4. Last write line to file x

My code
Code:
awk 'NR==1 || NR % 5000 == 1 && !/^\s*3/{close(f);f="File_"++i".tmp"}{print >f}' file

1. NR == 1 , close f, since f is not set, no effect on close(f), increment i thats 1 and f will be the name of file., instead of BEGIN block I used NR==1
2. when NR becomes 5001 remainder will be 1, and check whether line starts with digit 3 if not close f, increment i, file name will be changed
3. write line to file f

let me know if my understanding is wrong.
 

10 More Discussions You Might Find Interesting

1. Shell Programming and Scripting

Split A Large File

Hi, I have a large file(csv format) that I need to split into 2 files. The file looks something like Original_file.txt first name, family name, address a, b, c, d, e, f, and so on for over 100,00 lines I need to create two files from this one file. The condition is i need to ensure... (4 Replies)
Discussion started by: nbvcxzdz
4 Replies

2. Shell Programming and Scripting

Split Large File

HI, i've to split a large file which inputs seems like : Input file name_file.txt 00001|AAAA|MAIL|DATEOFBIRTHT|....... 00001|AAAA|MAIL|DATEOFBIRTHT|....... 00002|BBBB|MAIL|DATEOFBIRTHT|....... 00002|BBBB|MAIL|DATEOFBIRTHT|....... 00003|CCCC|MAIL|DATEOFBIRTHT|.......... (1 Reply)
Discussion started by: AMARA
1 Replies

3. Shell Programming and Scripting

Split a large file

I have a 3 GB text file that I would like to split. How can I do this? It's a giant comma-separated list of numbers. I would like to make it into about 20 files of ~100 MB each, with a custom header and footer. The file can only be split on commas, but they're plentiful. Something like... (3 Replies)
Discussion started by: CRGreathouse
3 Replies

4. Shell Programming and Scripting

Split a single record to multiple records & add folder name to each line

Hi Gurus, I need to cut single record in the file(asdf) to multile records based on the number of bytes..(44 characters). So every record will have 44 characters. All the records should be in the same file..to each of these lines I need to add the folder(<date>) name. I have a dir. in which... (20 Replies)
Discussion started by: ram2581
20 Replies

5. Shell Programming and Scripting

How to delete 1 record in large file!

Hi All, I'm a newbie here, I'm just wondering on how to delete a single record in a large file in unix. ex. file1.txt is 1000 records nikki1 nikki2 nikki3 what i want to do is delete the nikki2 record in file1.txt. is it possible? Please advise, Thanks, (3 Replies)
Discussion started by: nikki1200
3 Replies

6. UNIX for Dummies Questions & Answers

Split single record to multiple records

Hi Friends, source .... col1,col2,col3 a,b,1;2;3 here colom delimeter is comma(,). here we dont know what is the max length of col3 means now we have 1;2;3 next time i will receive 1;2;3;4;5;etc... required output .............. col1,col2,col3 a,b,1 a,b,2 a,b,3 please give me... (5 Replies)
Discussion started by: bab.galary
5 Replies

7. UNIX for Dummies Questions & Answers

Using awk to skip record in file

I need to amend the code blow such that it reads a "black list" before the "print" statement; if "substr($1,1,6)" is found in the "blacklist" it will ignore that record and continue. the code is from an awk script that is being called from shell script which passes the input values. BEGIN { "date... (5 Replies)
Discussion started by: bazel
5 Replies

8. Shell Programming and Scripting

How to split one record to multiple records?

Hi, I have one tab delimited file which is having multiple store_ids in first column seprated by pipe.I want to split the file on the basis of store_id(separating 1st record in to 2 records ). I tried some more options like below with using split,awk etc ,But not able to get proper output. can... (1 Reply)
Discussion started by: jaggy
1 Replies

9. UNIX for Advanced & Expert Users

How to split large file with different record delimiter?

Hi, I have received a file which is 20 GB. We would like to split the file into 4 equal parts and process it to avoid memory issues. If the record delimiter is unix new line, I could use split command either with option l or b. The problem is that the line terminator is |##| How to use... (5 Replies)
Discussion started by: Ravi.K
5 Replies

10. UNIX for Beginners Questions & Answers

Trying To Split a Large File

Trying to split a 35gb file into 1000mb parts. My research shows I should you this. split -b 1000m file.txt and my return is "split: cannot open 'crunch1.txt' for reading: No such file or directory" so I tried split -b 1000m Documents/Wordlists/file.txt and I get nothing other than the curser just... (3 Replies)
Discussion started by: sub terra
3 Replies
close(n)						       Tcl Built-In Commands							  close(n)

__________________________________________________________________________________________________________________________________________________

NAME
close - Close an open channel SYNOPSIS
close channelId _________________________________________________________________ DESCRIPTION
Closes the channel given by channelId. ChannelId must be an identifier for an open channel such as a Tcl standard channel (stdin, stdout, or stderr), the return value from an invocation of open or socket, or the result of a channel creation command provided by a Tcl extension. All buffered output is flushed to the channel's output device, any buffered input is discarded, the underlying file or device is closed, and channelId becomes unavailable for use. If the channel is blocking, the command does not return until all output is flushed. If the channel is nonblocking and there is unflushed output, the channel remains open and the command returns immediately; output will be flushed in the background and the channel will be closed when all the flushing is complete. If channelId is a blocking channel for a command pipeline then close waits for the child processes to complete. If the channel is shared between interpreters, then close makes channelId unavailable in the invoking interpreter but has no other effect until all of the sharing interpreters have closed the channel. When the last interpreter in which the channel is registered invokes close, the cleanup actions described above occur. See the interp command for a description of channel sharing. Channels are automatically closed when an interpreter is destroyed and when the process exits. Channels are switched to blocking mode, to ensure that all output is correctly flushed before the process exits. The command returns an empty string, and may generate an error if an error occurs while flushing output. If a command in a command pipe- line created with open returns an error, close generates an error (similar to the exec command.) EXAMPLE
This illustrates how you can use Tcl to ensure that files get closed even when errors happen by combining catch, close and return: proc withOpenFile {filename channelVar script} { upvar 1 $channelVar chan set chan [open $filename] catch { uplevel 1 $script } result options close $chan return -options $options $result } SEE ALSO
file(n), open(n), socket(n), eof(n), Tcl_StandardChannels(3) KEYWORDS
blocking, channel, close, nonblocking Tcl 7.5 close(n)
All times are GMT -4. The time now is 07:52 AM.
Unix & Linux Forums Content Copyright 1993-2022. All Rights Reserved.
Privacy Policy