Fixed Width file creation from csv
Post 302922324 by DGPickett on Thursday 23rd of October 2014 05:20:50 PM
I wrote a C tool for this and similar tasks. It still needs tr to delete the '.', but that is what pipes are for! Set its options to pad with 0, split columns on ',' and right-justify the numbers. Its original use was to align tab-separated columns in bulk data (assuming a fixed-pitch font):
Code:
$ cat mysrc/autotab.c
 
#include <stdio.h>
#include <stdlib.h>
#include <string.h>     /* strcmp(), strlen(), memset() */
static  FILE    *tmp ;          /* temp file */
static  char    *just = "l" ;   /* output column justification */
static  char    *osep = "  " ;  /* output column sep */
static  char    j ;             /* current justification */
static  char    sav[4096] ;     /* output column store for justification */
static  char    savl[65536];/* output line store */
static  int     c ;             /* character read */
static  int     cl = 0 ;        /* current column length */
static  int     col = 0 ;       /* current column # */
static  int     fs = 0 ;        /* possibly embedded spaces found */
static  int     gen_hdr = 0 ;   /* generate header state */
static  int     i ;             /* utility int */
static  int     no ;            /* narrow, overlap final column */
static  int     isep = '\t' ;   /* input column sep */
static  int     jlen = 1 ;      /* output column justification */
static  int     l[4096] ;       /* array of column widths */
static  int     ll = 0 ;        /* output line length */
static  int     maxcol = 0 ;    /* max column # */
int main( int argc, char **argv ){
        for ( i = 1 ; i < argc ; i++ ){
                if ( !strcmp( argv[i], "-is" ) && ( i + 1 ) < argc ){
                        isep = argv[++i][0] ;
                        continue ;
                 }
                if ( !strcmp( argv[i], "-os" ) && ( i + 1 ) < argc ){
                        osep = argv[++i] ;
                        continue ;
                 }
                if ( !strcmp( argv[i], "-no" )){
                        no = 1 ;
                        continue ;
                 }
                if ( !strcmp( argv[i], "-j" ) && ( i + 1 ) < argc ){
                        just = argv[++i] ;
                        jlen = strlen( just );
                        continue ;
                 }
                if ( !strcmp( argv[i], "-gh" ) ){
                        gen_hdr = 1 ;
                        continue ;
                 }
                fprintf( stderr,
"\n"
"Usage: autotab [ -is <i_sep> ] [ -os <o_sep> ] [ -gh ] [ -no ] [ -j <just> ]\n"
"\n"
"Scans input as columns defined by <i_sep> (default tab), measuring maximum\n"
"column width without blank padding and saving the input.  Lines with no\n"
"<i_sep> are not measured.  If -no is present (narrow, overlapping), the\n"
"characters between the last <i_sep> on a line and the line feed are not\n"
"measured.  (The -no option is only useful with left justification.)\n"
"After reading EOF, the saved input is printed, padded to the measured\n"
"column width and separated by the <o_sep> string (default 2 spaces) with\n"
"empty right side columns, blanks and carriage returns suppressed.\n"
"If -j is present, the characters of <just> define the justification of each\n"
"column with the same relative offset:\n"
" r for right, c for centered, and anything else means left.\n"
"If -gh is present, the saved input is prefixed by a numbered column header,\n"
"which is padded and aligned like the data.\n"
"The size limits are: %d measured columns, output line %d characters\n"
"and right or center justified column data width %d characters.\n"
"\n",
                        (int)( sizeof( l ) / sizeof( int ) ),   /* %d needs int, not size_t */
                        (int)sizeof( savl ),
                        (int)sizeof( sav ));
                exit( 1 );
         }
        if ( NULL == ( tmp = tmpfile() )){
                perror( "tmpfile()" );
                exit( 1 );
         }
        memset( (char*)l, 0, sizeof( l ) );
        do {
                switch( c = getchar() ){
                case EOF:
                        if ( ferror( stdin ) ){
                                perror( "stdin" );
                                exit( 1 );
                         }
                        continue ;      /* Out of loop */
                case '\n':
                        if ( no ){
                                col = 0 ;
                                cl = 0 ;
                                fs = 0 ;
                                break ;
                         }
                        /* Intentional Fall Through */
                case '\f':
                        if ( cl && col ){
                                if ( col == ( sizeof( l ) / sizeof( int ) )){
                                        fprintf( stderr,
                                                 "Too many columns!\n" );
                                        exit( 1 );
                                 }
                                if ( cl > l[col] ){
                                        l[col] = cl ;
                                 }
                                col++ ; /* count the final column even when it is not the widest */
                         }
                        if ( col > maxcol ){
                                maxcol = col ;
                         }
                        col = 0 ;
                        cl = 0 ;
                        fs = 0 ;
                        break ;
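                case ' ':
                        if ( cl ){
                                fs++ ;  /* held back until more column text follows */
                         }
                        /* Intentional fall through: blanks are not copied yet */
                case '\r':
                        continue ;      /* carriage returns are dropped */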
                default:
                        if ( c == isep ){
                                if ( cl ){
                                        if ( col == ( sizeof( l )
                                                        / sizeof( int ) )){
                                                fprintf( stderr,
                                                         "Too many columns!\n"
                                                        );
                                                exit( 1 );
                                         }
                                        if ( cl > l[col] ){
                                                l[col] = cl ;
                                         }
                                 }
                                col++ ;
                                cl = 0 ;
                                fs = 0 ;
                                break ;
                         }
                        cl++ ;
                        cl += fs ;
                        while ( fs ){
                                fs-- ;
                                if ( EOF == putc( ' ', tmp )){
                                        perror( "putc(tmp)" );
                                        exit( 1 );
                                 }
                         }
                        break ;
                 }
                if ( EOF == putc( c, tmp )){
                        perror( "putc(tmp)" );
                        exit( 1 );
                 }
        } while ( c != EOF );
        rewind( tmp );
        if ( gen_hdr ){
                col = 0 ;
                do {
                        if ( 0 > ( cl = printf( "Col. %d", col + 1 ))){
                                if ( ferror( stdout )){
                                        perror( "stdout" );
                                        exit( 1 );
                                 }
                                exit( 0 );
                         }
                        if ( cl > l[col] ){
                                l[col] = cl ;
                         } else while ( cl++ < l[col] ){
                                if ( EOF == putchar( ' ' )){
                                        if ( ferror( stdout )){
                                                perror( "stdout" );
                                                exit( 1 );
                                         }
                                        exit( 0 );
                                 }
                         }
                        if ( ++col >= maxcol ){ /* >= also stops if no column was measured */
                                break ;
                         }
                        if ( EOF == fputs( osep, stdout )){
                                if ( ferror( stdout )){
                                        perror( "stdout" );
                                        exit( 1 );
                                 }
                                exit( 0 );
                         }
                 } while ( 1 );
                if ( EOF == putchar( '\n' )){
                        if ( ferror( stdout ) ){
                                perror( "stdout" );
                                exit( 1 );
                         }
                        exit( 0 );
                 }
                cl = col = 0 ;
         }
        j = *just ;
        do {
                switch ( c = getc( tmp )){
                case EOF:
                        if ( ferror( tmp )){
                                perror( "getc(tmp)" );
                                exit( 1 );
                         }
                        if ( !ll && !cl ){
                                exit( 0 );
                         }
                        c = '\n' ;
                        /* Intentional fall through for EOF as linefeed */
                case '\f':
                case '\n':
                        if ( cl ){
                                if ( col ){
                                        fs = l[col] - cl ;
                                 } else {
                                        fs = 0 ;
                                 }
                                switch ( j ){
                                case 'c':
                                        fs >>= 1 ;
                                case 'r':
                                        if ( ll > ( sizeof( savl ) - fs - cl )){
                                                fputs( "Output line too long!\n",
                                                        stderr );
                                                exit( 1 );
                                        }
                                        ll += sprintf( savl + ll,
                                                "%*s%.*s",
                                                fs,
                                                "",
                                                cl,
                                                sav );
                                        break ;
                                 }
                         }
                        /* trim trailing blanks without reading before savl[0] */
                        while ( ll > 0 && ( savl[ll - 1] == ' '
                             || savl[ll - 1] == '\t' )){
                                ll-- ;
                         }
                        if ( 0 > printf( "%.*s%c", ll, savl, c )){
                                if ( ferror( stdout )){
                                        perror( "stdout" );
                                 }
                                exit( 1 );
                         }
                        ll = 0 ;
                        col = 0 ;
                        cl = 0 ;
                        fs = 0 ;
                        j = *just ;
                        break ;
                default:
                        if ( c == isep ){
                                fs = l[col] - cl ;
                                if ( ll >
                                  ( sizeof( savl ) - fs - cl - strlen( osep ))){
                                        fputs( "Output line too long!\n",
                                                stderr );
                                        exit( 1 );
                                }
                                switch ( j ){
                                case 'c':
                                        ll += sprintf( savl + ll,
                                                "%*s%.*s%*s",
                                                fs >> 1,
                                                "",
                                                cl,
                                                sav,
                                                fs - ( fs >> 1 ),
                                                "" );
                                        break ;
                                case 'r':
                                        ll += sprintf( savl + ll,
                                                "%*s%.*s",
                                                fs,
                                                "",
                                                cl,
                                                sav );
                                        break ;
                                default:
                                        ll += sprintf( savl + ll, "%*s", fs, ""
                                                );
                                        break ;
                                 }
                                ll += sprintf( savl + ll, "%s", osep );
                                if ( ++col < jlen ){
                                        j = just[col] ;
                                 } else {
                                        j = 'l' ;
                                 }
                                fs = 0 ;
                                cl = 0 ;
                                continue ;
                         }
                        if ( j == 'r' || j == 'c' ){
                                if ( cl >= sizeof( sav )){
                                        fprintf( stderr,
"\nFatal: Column %d too wide.\n",
                                                ++col );
                                        exit( 1 );
                                 }
                                sav[cl++] = c ;
                                continue ;
                         }
                        if ( ll >= sizeof( savl )){
                                fprintf( stderr, "Output line too long!\n" );
                                exit( 1 );
                         }
                        cl++ ;
                        savl[ll++] = c ;
                        break ;
                 }
         } while ( c != EOF );
        exit( 0 );
 }

 
