Parsing a variable length file


 
# 1  
Old 06-01-2009

Hi, I am new to shell scripting. I need to parse a file that contains header and detail records and split it into n files based on the department ID. For example:
INPUT FILE:

DEPT ID: 1
EMPNAME: XYZ
EMPAddress: XYZZZ

DEPT ID: 2
EMPNAME: ABC
EMPAddress: ABCD

DEPT ID: 1
EMPNAME: PQR
EMPAddress: PQRS

DEPT ID: 2
EMPNAME: EFG
EMPAddress: EFGH

DEPT ID: 3
EMPNAME: LMN
EMPAddress: LMNO

After processing, the above file should be split into three files, each containing only the records for one department ID.

Could you please help me?

DSingh
# 2  
Old 06-01-2009
Code:
sed '/^$/d'  input_file.txt |  awk -F": "  '/DEPT/ { no=$2 ; print $0 >> "File_"$2; next} {print $0 >>"File_"no}'

# 3  
Old 06-01-2009
If you have Python:
Code:
#!/usr/bin/env python
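# Read the whole input and split it into blank-line-separated records.
with open("file") as src:
    records = src.read().split("\n\n")
for rec in records:
    rec = rec.strip()
    if not rec:
        continue  # skip empty chunks, e.g. from trailing blank lines
    # The department number sits between the first ":" and "EMPNAME".
    dept_number = rec[rec.index(":") + 1:rec.index("EMPNAME")].strip()
    # Append, so records for the same department accumulate in one file.
    with open("file_" + dept_number + ".txt", "a") as out:
        out.write(rec + "\n")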

Output:
Code:
# ./test.py
# more file_1.txt
DEPT ID: 1
EMPNAME: XYZ
EMPAddress: XYZZZ
DEPT ID: 1
EMPNAME: PQR
EMPAddress: PQRS

# more file_2.txt
DEPT ID: 2
EMPNAME: ABC
EMPAddress: ABCD
DEPT ID: 2
EMPNAME: EFG
EMPAddress: EFGH

# more file_3.txt
DEPT ID: 3
EMPNAME: LMN
EMPAddress: LMNO

# 4  
Old 06-01-2009
Another way (based on panyam's solution):
Code:
awk '/DEPT ID/{n=$2;getline;print >>"File_"n;getline;print >>"File_"n}' FS=': ' inputfile.txt

# 5  
Old 06-01-2009
The trouble with panyam's solution:

Code:
sed '/^$/d'  input_file.txt |  awk -F": "  '/DEPT/ { no=$2 ; print $0 >> "File_"$2; next} {print $0 >>"File_"no}'

is that it assumes you don't want to keep the blank lines, and it also assumes you want the department number repeated many times in each file (which seems to me like a waste of space).

The trouble with Klashxx's solution:

Code:
awk '/DEPT ID/{n=$2;getline;print >>"File_"n;getline;print >>"File_"n}' FS=': ' inputfile.txt

is that it assumes you have exactly two fields following each department ID in your input file, which doesn't make it very robust.
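
To make that concrete, here is a small Python sketch that roughly mimics the "take exactly two lines after each DEPT ID line" logic. The EMPPhone field and the function name take_two_after_header are made up for illustration; they are not in the original data or in anyone's solution.

```python
# Hypothetical record with THREE fields after the department line.
lines = [
    "DEPT ID: 1",
    "EMPNAME: XYZ",
    "EMPAddress: XYZZZ",
    "EMPPhone: 555",  # extra field, not in the original sample data
]

def take_two_after_header(lines):
    """Keep exactly two lines after each 'DEPT ID' line, like two getline calls."""
    out = {}
    it = iter(lines)
    for line in it:
        if "DEPT ID" in line:
            dept = line.split(": ", 1)[1]
            for _ in range(2):
                nxt = next(it, None)
                if nxt is not None:
                    out.setdefault(dept, []).append(nxt)
        # Any other line reached here is silently dropped.
    return out

result = take_two_after_header(lines)
print(result)
```

The third field never reaches the output for department 1, because nothing picks it up once the two fixed reads are done.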

So for a neater and more robust answer, try:

Code:
awk -F": " '$1=="DEPT ID" {n=$2; next} {print >> ("dep" n ".txt")}' inputfile.txt

If, after all, you did want the department number repeated in your individual output files, omit the next statement, i.e.

Code:
awk -F": " '$1=="DEPT ID" {n=$2} {print >> ("dep" n ".txt")}' inputfile.txt
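
For anyone who would rather do this in Python, the same idea (remember the current department and send every following line to it, however many fields there are) might look roughly like the sketch below. The function name split_by_dept, the keep_header flag and the in-memory dict are my own assumptions for illustration; the sample data is a shortened copy of the input from the first post.

```python
def split_by_dept(lines, keep_header=False):
    """Group lines under the most recent 'DEPT ID' line seen."""
    groups = {}
    dept = None
    for line in lines:
        key, sep, value = line.partition(": ")
        if key == "DEPT ID" and sep:
            dept = value.strip()
            if not keep_header:
                continue  # same effect as "next" in the awk version
        if dept is not None:
            # Blank lines are kept too, as in the awk one-liner.
            groups.setdefault(dept, []).append(line)
    return groups

sample = """DEPT ID: 1
EMPNAME: XYZ
EMPAddress: XYZZZ

DEPT ID: 2
EMPNAME: ABC
EMPAddress: ABCD

DEPT ID: 1
EMPNAME: PQR
EMPAddress: PQRS
""".splitlines()

groups = split_by_dept(sample)
for dept, rows in sorted(groups.items()):
    print("dept", dept, rows)
```

Passing keep_header=True corresponds to dropping the next statement, so each DEPT ID line is kept in its group. Writing each group out to its own file is then a short loop over groups.items().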

# 6  
Old 06-02-2009
perl:

Code:
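use strict;
use warnings;

my %hash;
{
  local $/ = "\n\n";   # read blank-line-separated records
  while (<DATA>) {
    # collect each record under its department number
    if (/DEPT ID:\s*([0-9]+)\s*EM/) {
      $hash{$1} .= $_;
    }
  }
}

# write one output file per department
foreach my $item (keys %hash) {
  my $file = sprintf("file%s.txt", $item);
  open my $fh, '>', $file or die "Cannot open $file: $!";
  print $fh $hash{$item};
  close $fh;
}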

__DATA__
DEPT ID: 1
EMPNAME: XYZ
EMPAddress: XYZZZ

DEPT ID: 2
EMPNAME: ABC
EMPAddress: ABCD

DEPT ID: 1
EMPNAME: PQR
EMPAddress: PQRS

DEPT ID: 2
EMPNAME: EFG
EMPAddress: EFGH

DEPT ID: 3
EMPNAME: LMN
EMPAddress: LMNO

# 7  
Old 06-02-2009
Can someone help me write this as a shell script (Korn shell or bash)?
Thanks in advance.