Help- counting delimiter in a huge file and split data into 2 files


 
# 8  
03-01-2011
Quote:
Originally Posted by Corona688
I suspect your system isn't Linux, because Linux generally has [gn]awk, only gawk, and nothing but gawk. Glad you got it working.
It's SunOS. Today I found out that nawk also gave bad results. Many records with the correct count of 291 semicolons were rejected, and the rejected file showed that they had only 198 semicolons in them. Those rejected records were also truncated in the middle: the original record had more data in the row than the rejected row did. What could cause that?
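As a cross-check on the nawk output, the same split can be sketched in Python, which reads lines of arbitrary length. This is only an illustration, not the script used in this thread: the function name `split_by_delimiter_count` and the file names in the usage note are hypothetical, and it assumes one record per line with 291 semicolons expected in a good record (the figure quoted above). Some older awk implementations limit record length, which might explain the truncation, but that is a guess and has not been confirmed here.

```python
def split_by_delimiter_count(src, good, bad, expected=291, delim=b";"):
    """Copy each line of `src` to `good` if it contains exactly `expected`
    delimiters, otherwise to `bad`. Returns (kept, rejected) line counts.

    Files are opened in binary mode so that unusual bytes in the data
    cannot cause decoding errors, and every line is written back whole.
    """
    kept = rejected = 0
    with open(src, "rb") as fin, \
         open(good, "wb") as fgood, \
         open(bad, "wb") as fbad:
        for line in fin:  # reads one line at a time, so huge files are fine
            if line.count(delim) == expected:
                fgood.write(line)
                kept += 1
            else:
                fbad.write(line)
                rejected += 1
    return kept, rejected
```

Calling `split_by_delimiter_count("input.dat", "good.dat", "rejected.dat")` (placeholder file names) returns the two counts. If the rejected count differs from what nawk reported, or the rejected lines are no longer truncated, that would point at nawk rather than at the data.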
