Removing inserted newlines from a field of a fixed-width file.


 
# 1  
Old 08-18-2009

Hi champs!

I have a fixed-width file in which the records look like this:

Code:
11111 <fixed spaces such as 6> description for 11111 <fixed spaces such as 6> some more field to the record of 11111
22222 <fixed spaces such as 6> description for 22222 <fixed spaces such as 6> some more field to the record of 22222
33333 <fixed spaces such as 6> description 
for 33333 <fixed spaces such as 6> some more field to the record of 33333
44444 <fixed spaces such as 6> description for 44444 <fixed spaces such as 6> some more field to the record of 44444

As you can see, the record for 33333 is split into two lines because a newline was inserted in its description. I want these extraneous newlines removed from the description field wherever they appear in the file.
One clue: check positions 11-32 of each record and, if a newline is present, strip it.
Any other solution is welcome too.
I want the output to be:

Code:
11111 <fixed spaces such as 6> description for 11111 <fixed spaces such as 6> some more field to the record of 11111
22222 <fixed spaces such as 6> description for 22222 <fixed spaces such as 6> some more field to the record of 22222
33333 <fixed spaces such as 6> description for 33333 <fixed spaces such as 6> some more field to the record of 33333
44444 <fixed spaces such as 6> description for 44444 <fixed spaces such as 6> some more field to the record of 44444


- The line break will not necessarily appear right after 'description'; it can appear anywhere in the second field. But if it appears at all, it will be in the second field only.

- This is just a sample record for illustration; the code should not depend on its contents. It can depend on positions if required.
It is a fixed-width file, meaning each field is identified by its length in the record.


Please let me know if you need more clarification.

Last edited by enigma_1; 08-18-2009 at 06:55 PM.. Reason: code tags, PLEASE!
# 2  
Old 08-18-2009

Something to start with...

'len' is the known/expected length of ALL the records (assuming they all have the same length); it defaults to 73.

Assumption: there is only ONE extra newline per 'broken' record.
nawk -f enigma.awk myFile
OR
nawk -v len=63 -f enigma.awk myFile

enigma.awk:
Code:
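BEGIN {
  len=(!len)?73:len            # default record length unless set with -v len=N
}
# A line shorter than len is a fragment of a broken record.
length($0) < len {
   if (length(s)) { print s OFS $0;s=""}   # second fragment: print the joined record
   else s=$0                               # first fragment: hold it
   next
}
1                              # intact record: print as is
END { if (length(s)) print s } # flush a fragment left without its partner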
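
A quick way to try the logic above without creating enigma.awk first is to inline it and feed it a made-up sample. This is only a sketch: the 20-character record width, the sample rows, and the use of plain awk instead of nawk are assumptions for illustration, and the sample assumes the stray newline took the place of a single space (which is what joining with OFS implies).

```shell
#!/bin/sh
# Hypothetical sample: every intact record is 20 characters wide.
# Record 33333 has one stray newline in its second field.
sample='11111 aaa bbb    end
22222 ccc ddd    end
33333 eee
fff    end
44444 ggg hhh    end'

joined=$(printf '%s\n' "$sample" | awk -v len=20 '
length($0) < len {
    if (length(s)) { print s OFS $0; s = "" }   # second fragment: emit joined record
    else s = $0                                 # first fragment: hold it
    next
}
1')

printf '%s\n' "$joined"
```

With this input the two fragments of record 33333 come out as one 20-character line. The approach relies on the stated assumption of exactly one extra newline per broken record.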

# 3  
Old 08-19-2009
Thanks vgersh!!

The code you provided worked for the records broken into two.
But I have some more problems; I hope you can help.
As you mentioned in your assumption, the script expects a record to be split into two lines only.
Unfortunately, my file has one record that is split into three lines.

Sample:
Code:
33333 <fixed spaces such as 6> description 
for 
33333 <fixed spaces such as 6> some more field to the record of 33333

which needs to be :

Code:
33333 <fixed spaces such as 6> description for 33333 <fixed spaces such as 6> some more field to the record of 33333

Can the enigma.awk program be modified to handle a record broken into three lines? If I can ask for more, can the code handle any number of breaks within a record?
I guess you need some way to identify the start of each record.

In my file, each new record starts at column (length) 16. If a line starts before column 16, it is a continuation of the previous record.

Thank you once again!
# 4  
Old 08-19-2009
enigma.awk:
Code:
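BEGIN {
  len=(!len)?73:len            # default record length unless set with -v len=N
}
# Accumulate short lines until the joined record reaches the full length.
length($0) < len {
   if (length(s)) { s=s OFS $0}
   else s=$0
   if (length(s) == len) { print s; s=""}
   next
}
1                              # intact record: print as is
END { if (length(s)) print s } # flush fragments that never reached len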
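
A slightly more defensive variant of the same accumulate-until-full idea could look like the sketch below. The `sep` variable, the 20-character width, and the sample rows are assumptions made up for the example: `sep` stands for whatever each stray newline replaced (" " if it took the place of a space, "" if it was inserted between two existing characters), and that has to be checked against the real data. Plain awk is used instead of nawk.

```shell
#!/bin/sh
# Hypothetical sample: intact records are 20 characters wide and
# record 33333 is split into three fragments.
sample='11111 aaa bbb    end
33333
eee
fff    end
44444 ggg hhh    end'

fixed=$(printf '%s\n' "$sample" | awk -v len=20 -v sep=" " '
{
    s = (s == "") ? $0 : s sep $0      # append this line to the pending record
    if (length(s) < len) next          # still incomplete: keep accumulating
    if (length(s) > len)               # overshoot: report it instead of hiding it
        printf("line %d: joined record is %d chars, expected %d\n",
               NR, length(s), len) | "cat 1>&2"
    print s
    s = ""
}
END { if (s != "") print s }           # flush an incomplete trailing record
')

printf '%s\n' "$fixed"
```

Here every line goes through the same rule, so an intact record is simply a "fragment" that is already complete. Joined records that end up longer than `len` are still printed, but a warning goes to stderr so they can be inspected by hand.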

# 5  
Old 08-19-2009
What is the length of a record?

Regards
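
If the record length is not known up front, one way to estimate it is to count how often each line length occurs; in a file where most records are intact, the most frequent length is probably the record length. A minimal sketch, with a made-up 20-character sample standing in for the real file:

```shell
#!/bin/sh
# Hypothetical sample: three intact 20-character records and one record
# split into two fragments.
sample='11111 aaa bbb    end
22222 ccc ddd    end
33333 eee
fff    end
44444 ggg hhh    end'

# Print "count length" pairs, most frequent length first.
lengths=$(printf '%s\n' "$sample" |
    awk '{ n[length($0)]++ } END { for (l in n) print n[l], l }' |
    sort -k1,1nr -k2,2n)

printf '%s\n' "$lengths"

# The length on the first row is the likely record length.
likely_len=$(printf '%s\n' "$lengths" | head -n 1 | awk '{ print $2 }')
printf 'likely record length: %s\n' "$likely_len"
```

The other rows of the listing show how many lines are fragments and how long they are, which is a quick check on how widespread the broken records are.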
# 6  
Old 08-19-2009
I might be wrong, but isn't this the very type of problem the "fmt" simple optimal formatter tool was created for?

"fmt -w <your desired line length here>" should do the trick.

I hope this helps.

bakunin
# 7  
Old 08-19-2009
Quote:
Originally Posted by bakunin
I might be wrong, but isn't this the very type of problems the "fmt" simple optimal formatter tool was created for?

"fmt -w <your desired line length here>" should do the trick.

I hope this helps.

bakunin
Good tip, I forgot about fmt. Thanks.
