Split a fixed length file based on last occurrence of string


 
# 1  
Old 07-18-2013

Hi,

I need to split a file based on the last occurrence of a string. Please find the explanation below.
I have a file in the following format:
Code:
aaaaaaaaaaaaaaaaaaaaaaaaaaa
bbbbbbbbbbbbbbbbbbbbbbbbbbb
ccccccccccccccccccccccccccc
ddddddddddddddddddddddddddd
3186rrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrr
aaaaaaaaaaaaaaaaaaaaaaaaaaa
aaaaaaaaaaaaaaaaaaaaaaaaaaa
aaaaaaaaaaaaaaaaaaaaaaaaaaa
3186ppppppppppppppppppppppp
fffffffffffffffffffffffffffffffffffffffffff
fffffffffffffffffffffffffffffffffffffffffff
9876ttttttttttttttttttttttttttttttt
kkkkkkkkkkkkkkkkkkkkkkkkkkkkkkk
ppppppppppppppppppppppppppp
yyyyyyyyyyyyyyyyyyyyyyyyyyy
9876vvvvvvvvvvvvvvvvvvvvvvv

Now I need to split this file so that all the lines up to and including the last occurrence of 3186 go to one file, and all the lines after that, up to and including the last occurrence of 9876, go to a second file. So there should be two files, the first with the following data:
Code:
aaaaaaaaaaaaaaaaaaaaaaaaaaa
bbbbbbbbbbbbbbbbbbbbbbbbbbb
ccccccccccccccccccccccccccc
ddddddddddddddddddddddddddd
3186rrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrr
aaaaaaaaaaaaaaaaaaaaaaaaaaa
aaaaaaaaaaaaaaaaaaaaaaaaaaa
aaaaaaaaaaaaaaaaaaaaaaaaaaa
3186ppppppppppppppppppppppp

and the second file should contain:
Code:
fffffffffffffffffffffffffffffffffffffffffff
fffffffffffffffffffffffffffffffffffffffffff
9876ttttttttttttttttttttttttttttttt
kkkkkkkkkkkkkkkkkkkkkkkkkkkkkkk
ppppppppppppppppppppppppppp
yyyyyyyyyyyyyyyyyyyyyyyyyyy
9876vvvvvvvvvvvvvvvvvvvvvvv

Kindly help in achieving this.

Last edited by Scrutinizer; 07-18-2013 at 12:40 AM.. Reason: code tags
# 2  
Old 07-18-2013
You could try this approach:
Code:
awk 'NR==FNR{if(/^3186/)n=NR; next} FNR==1,FNR==n{print > "FileA"; next}1' file file >fileB

The input file is read twice, first to determine the last occurrence and the second time to make the split...
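The same two-pass idea can be extended so that the second file stops at the last 9876 line. What follows is only a minimal sketch, not a tested replacement for the one-liner above: the names demo_input, demo_out1 and demo_out2 are hypothetical, the patterns are assumed to appear only at the start of a line, and index() is used instead of a regular expression so that the patterns are matched literally.

```shell
#!/bin/sh
# Sketch only: demo_input, demo_out1 and demo_out2 are hypothetical names.
# Build a small sample with one trailing line after the last 9876 line.
printf '%s\n' aaa 3186r bbb 3186p fff 9876t kkk 9876v zzz > demo_input

awk -v p1=3186 -v p2=9876 '
    # Pass 1: remember the last line number that starts with each pattern.
    NR == FNR {
        if (index($0, p1) == 1) n1 = FNR
        if (index($0, p2) == 1) n2 = FNR
        next
    }
    # Pass 2: lines up to the last p1 line go to the first file; lines
    # after that, up to the last p2 line, go to the second file.
    FNR <= n1 { print > "demo_out1"; next }
    FNR <= n2 { print > "demo_out2" }
' demo_input demo_input
```

With this sample, the trailing zzz line ends up in neither output file, because it comes after the last 9876 line.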
# 3  
Old 07-18-2013
What are the names of the three files?

Do the 3186 and 9876 only appear at the start of a line, or can they appear anywhere in a line?

How big are these files?

Are the filenames and matching patterns constants, or do you want to pass them as arguments to a shell script?
# 4  
Old 07-18-2013
Hi,
The files are very big, with over 1.5 lakh (150,000) records. The matching patterns have a fixed length of 8 characters, and I have to read them from a file.
For each line of that pattern file, I will search the file that needs to be split.
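One possible reading of this requirement is a pattern file with one pattern per line, where each pattern closes one output file at its last occurrence. The sketch below rests on assumptions that are not confirmed in this thread: the demo_* file names are hypothetical, patterns are matched literally at the start of a line, and the last occurrences appear in the data file in the same order as the patterns are listed.

```shell
#!/bin/sh
# Sketch only: all demo_* file names are hypothetical.
printf '%s\n' 3186 9876 > demo_patterns
printf '%s\n' aaa 3186r bbb 3186p fff 9876t kkk 9876v zzz > demo_data

# Start clean so that output from an earlier run is not mistaken for new output.
rm -f demo_part1 demo_part2

awk '
    FNR == 1 { pass++ }                 # a new input file (or pass) begins
    pass == 1 { pat[++np] = $0; next }  # pass 1: load the patterns in order
    pass == 2 {                         # pass 2: last matching line per pattern
        for (i = 1; i <= np; i++)
            if (index($0, pat[i]) == 1) last[i] = FNR
        next
    }
    {                                   # pass 3: write each line to its part
        for (i = 1; i <= np; i++)
            if (FNR <= last[i]) { print > ("demo_part" i); next }
    }
' demo_patterns demo_data demo_data
```

The data file is named twice on the command line so that awk reads it once to locate the boundaries and once to write the parts. Lines after the last boundary are not written anywhere.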
# 5  
Old 07-18-2013
You didn't say what your filenames are??? You didn't say whether the patterns appear only at the start of lines or appear anywhere in lines in your input file???

Making lots of wild assumptions:
  1. The patterns you're trying to match don't contain any characters that are "special" in a regular expression.
  2. The patterns you're trying to match don't contain any whitespace characters.
  3. The patterns you're trying to match don't contain any question mark characters.
  4. The patterns you're trying to match only appear at the start of a line in your input file.
  5. The patterns you want to match appear on the first line of a file named patterns and are separated by one or more space or tab characters.
  6. The names of the input file and both output files will be passed to this script as operands 1, 2, and 3, respectively (and default to files named input, out1, and out2 if operands are not given to the script).
  7. Your output filenames do not contain any whitespace characters.
  8. The last line in your input file matching the 1st pattern appears earlier in the input file than the last line matching the 2nd pattern.
If any of these assumptions are incorrect, the following script may need to be modified to make it work. If all of them are correct, the following script produces the output you requested when the input file contains the sample input from your 1st message in this thread and the file patterns contains:
Code:
3186 9876

as the first two fields on the first line:
Code:
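#!/bin/ksh
# Operands 1-3 name the input file and the two output files.
f1=${1:-input}
f2=${2:-out1}
f3=${3:-out2}
# The two patterns are the first two fields on line 1 of the file "patterns".
read pat1 pat2 junk < patterns
# "1;?^pat?" searches backwards from line 1, wrapping around to the end of
# the file, so it lands on the last matching line; "k" then marks that line.
ed -s "$f1" <<-END_ED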
        1;?^$pat1?ka
        1;?^$pat2?kb
        1,'aw $f2
        'a+1,'bw $f3
        q
END_ED

This was written and tested using the Korn shell, but should work fine with any other shell that recognizes basic Bourne shell syntax or the shell syntax specified by the POSIX Standards and the Single UNIX Specifications.

The awk script Scrutinizer provided makes some of the above assumptions and also assumes that your second pattern can be found on the last line in your input file (which was true in your example). The ed script above allows other lines to follow the last line matching the 2nd pattern and does not copy them to the 2nd file.