How to fix line breaks format text for huge files?


 
# 15  
Old 01-13-2012
You can give this a try:
Code:
awk '{n="\n"}/^HEADER/{print;delete A[D];next}/^D/{p=(!A[D]++)?z:n;printf "%s%s",p,$0;next}{if(/^TRAILER/)printf "%s%s%s",n,$0,n;else printf " %s",$0}' tst

# 16  
Old 01-14-2012
Alternatively:
Code:
awk '/^[THD]/{if(NR>1)print p;p=$0;next}{p=p FS $0}END{if(NR)print p}' infile

or
Code:
awk 'NR>1{printf "%s",(/^[THD]/?RS:FS)}1;END{if(NR)print RS}' ORS= infile
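
In case the one-liners are hard to follow, here is the same buffering idea spelled out as a shell function. This is only a sketch: the name join_records is made up for illustration, and it assumes (as in the sample data) that every real record starts with H, D or T and that no continuation line does.

```shell
#!/bin/sh
# join_records (hypothetical name): read records on stdin and write them to
# stdout with continuation lines folded into the record they belong to.
# Assumption: a line starting with H, D or T begins a new record.
join_records() {
  awk '
    /^[HDT]/ {                 # start of a new record
      if (have) print rec      # flush the previous record, if any
      rec = $0; have = 1
      next
    }
    {                          # continuation line: append with one space
      rec = (have ? rec " " $0 : $0)
      have = 1
    }
    END { if (have) print rec }  # flush the last record
  '
}
```

It would be called as join_records < infile > outfile. Only one record is held in memory at a time, so the size of the file should not matter much.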

# 17  
Old 01-14-2012
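With so much interest in this topic, let's try another way for performance, since the file is huge. This edits the file in place: each unwanted newline is overwritten with a space, so the file size never changes and nothing has to be copied. It should be very fast.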
Code:
linux$ cat t.c
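#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <fcntl.h>
#include <unistd.h>
#include <sys/mman.h>

int main(int ac, char *av[])
{
  struct stat st;
  char *base, *m, *e;
  int fd;

  if (ac != 2) { fprintf(stderr, "usage: %s file\n", av[0]); exit(1); }
  fd = open(av[1], O_RDWR);
  if (fd < 0) { perror("open"); exit(1); }
  if (fstat(fd, &st) < 0) { perror("fstat"); exit(1); }
  if (st.st_size == 0) return 0;     /* mmap rejects a zero-length mapping */
  base = mmap(0, st.st_size, PROT_READ|PROT_WRITE, MAP_SHARED, fd, 0);
  if (base == MAP_FAILED) { perror("mmap"); exit(1); }
  m = base;
  e = base + st.st_size;

  /* The mapping is not NUL-terminated, so search with an explicit length. */
  while (m < e) {
    if ((m = memchr(m, '\n', e - m)) == 0) break;
    if (++m == e) break;             /* final newline: nothing follows it */
    if (*m == 'T') break;            /* TRAILER line: done */
    if (*m != 'D') m[-1] = ' ';      /* continuation: join to previous line */
  }

  munmap(base, st.st_size);
  close(fd);
  return 0;
}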

linux$ gcc t.c

linux$ cat t.dat
HEADER474687
D1356jkl ugbliuybikb 879870
898976098 9687680
D77656757 uhgliug liygoiygig
D98679hjh kjbgihguygfu ugliyh
kbygfluy9809
D8796870 kjlhuigiyig
TRAILER0008

linux$ cp t.dat t1.dat

linux$ ./a.out t1.dat

linux$ cat t1.dat
HEADER474687
D1356jkl ugbliuybikb 879870 898976098 9687680
D77656757 uhgliug liygoiygig
D98679hjh kjbgihguygfu ugliyh kbygfluy9809
D8796870 kjlhuigiyig
TRAILER0008

linux$ cmp -l t.dat t1.dat
 41  12  40
118  12  40
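
To read the cmp -l output above: each row is a byte offset followed by the old and new byte values in octal, so 12 40 means a newline became a space. As a rough sanity check after an in-place edit like this, something along these lines could confirm that nothing else changed. The function name is hypothetical, and it assumes you kept a copy of the original file.

```shell
#!/bin/sh
# only_newlines_became_spaces (hypothetical name): succeed only if the two
# files have the same size and every differing byte went from newline
# (octal 12) to space (octal 40).
only_newlines_became_spaces() {
  orig=$1 edited=$2
  # An in-place overwrite must not change the file size.
  [ "$(wc -c < "$orig")" -eq "$(wc -c < "$edited")" ] || return 1
  # cmp -l prints "offset old new" (values in octal) for each differing byte.
  cmp -l "$orig" "$edited" |
    awk '$2 != 12 || $3 != 40 { bad = 1 } END { exit (bad ? 1 : 0) }'
}
```

With the files from this post it would be run as only_newlines_became_spaces t.dat t1.dat, and the exit status tells you whether the edit touched anything other than newlines.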

# 18  
Old 01-16-2012
Withdrawn - didn't see the 2nd page of responses before submitting.
# 19  
Old 01-16-2012
try this

Code:
sed -nr '
1h
1!{
  /^(HEADER|TRAILER|D)/!H
  /^(HEADER|TRAILER|D)/{
    x; s/\n/ /g; p
  }
}
${
  x; s/\n/ /g; p
}' filename

# 20  
Old 01-16-2012
Hi, thank you for pointing out the error, and sorry for the delay.
Try this out:

Input:
Code:
 
HEADER474687
D1356jkl ugbliuybikb 879870
898976098 9687680
D77656757 uhgliug liygoiygig
D98679hjh kjbgihguygfu ugliyh
kbygfluy9809
D8796870 kjlhuigiyig
TRAILER0008

Code:
Code:
 
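awk '{
if (NR == 1)
 printf "%s", $0;          # header: newline is added by the next record
else if ($0 ~ /^TRAILER/)
 last = $0;                # keep the trailer for the END block
else if ($0 ~ /^D/)
 printf "\n%s", $0;        # a detail record starts a new output line
else
 printf " %s", $0;         # continuation: join to the record with a space
}
END {
printf "\n%s\n", last;
}' f5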

Output:

Code:
HEADER474687
D1356jkl ugbliuybikb 879870 898976098 9687680
D77656757 uhgliug liygoiygig
D98679hjh kjbgihguygfu ugliyh kbygfluy9809
D8796870 kjlhuigiyig
TRAILER0008

Regards,
A!