How to fix line breaks format text for huge files?
Post 302590158 by binlib on Saturday 14th of January 2012 11:28:33 AM
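With so much interest in this topic, let's try another approach aimed at performance, since the file is huge. The program below maps the file into memory with mmap and edits it in place: a newline is overwritten with a space whenever the next line does not start with 'D', and processing stops at the trailer line starting with 'T'. Because each newline is replaced by exactly one byte, the file size never changes and nothing has to be copied, so it should be very fast.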
Code:
linux$ cat t.c
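#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <fcntl.h>
#include <sys/mman.h>
#include <unistd.h>

int main(int ac, char *av[])
{
  struct stat st;
  char *base, *m, *e;
  int fd;

  if (ac != 2) { fprintf(stderr, "usage: %s file\n", av[0]); exit(1); }
  fd = open(av[1], O_RDWR);
  if (fd < 0) { perror("open"); exit(1); }
  if (fstat(fd, &st) < 0) { perror("fstat"); exit(1); }
  if (st.st_size == 0) return 0;  /* empty file: mmap rejects length 0 */
  base = mmap(0, st.st_size, PROT_READ|PROT_WRITE, MAP_SHARED, fd, 0);
  if (base == MAP_FAILED) { perror("mmap"); exit(1); }
  e = base + st.st_size;

  /* Join continuation lines: a newline followed by anything other than
     'D' (data record) or 'T' (trailer) becomes a space. */
  for (m = base; m < e; ) {
    /* memchr is bounded; the mapping is not guaranteed to be NUL-terminated */
    if ((m = memchr(m, '\n', e - m)) == 0) break;
    if (++m == e || *m == 'T') break;  /* end of file or trailer reached */
    if (*m != 'D') m[-1] = ' ';
  }

  if (munmap(base, st.st_size) < 0) { perror("munmap"); exit(1); }
  close(fd);
  return 0;
}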

linux$ gcc t.c

linux$ cat t.dat
HEADER474687
D1356jkl ugbliuybikb 879870
898976098 9687680
D77656757 uhgliug liygoiygig
D98679hjh kjbgihguygfu ugliyh
kbygfluy9809
D8796870 kjlhuigiyig
TRAILER0008

linux$ cp t.dat t1.dat

linux$ ./a.out t1.dat

linux$ cat t1.dat
HEADER474687
D1356jkl ugbliuybikb 879870 898976098 9687680
D77656757 uhgliug liygoiygig
D98679hjh kjbgihguygfu ugliyh kbygfluy9809
D8796870 kjlhuigiyig
TRAILER0008

linux$ cmp -l t.dat t1.dat
 41  12  40
118  12  40
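
The two lines of cmp -l output are the two newlines (octal 12) that became spaces (octal 40). To try the joining rule without touching a file, it can be pulled out into a function that works on an in-memory buffer. The following is only a sketch: the name join_continuations and its return value (the number of newlines replaced) are assumptions made for illustration and are not part of the original post.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper (not from the original post): applies the same
   rule as t.c to the len bytes at buf and returns how many newlines
   were overwritten with spaces. The buffer length never changes. */
size_t join_continuations(char *buf, size_t len)
{
  char *m = buf;
  char *e = buf + len;
  size_t joined = 0;

  assert(buf != NULL);
  while (m < e) {
    /* bounded search: buf does not need to be NUL-terminated */
    m = memchr(m, '\n', (size_t)(e - m));
    if (m == NULL) break;
    if (++m == e || *m == 'T') break;  /* end of data or trailer line */
    if (*m != 'D') {                   /* continuation of the previous record */
      m[-1] = ' ';
      joined++;
    }
  }
  return joined;
}
```

Called on the contents of t.dat above, this replaces two newlines, which agrees with the two differences that cmp -l reports. Keeping the logic in a function like this also makes it easy to point it at an mmap'd region or at a test string.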
