

Aligning text files by max field length


 
# 1  
Old 08-11-2011

Hello,
Is there any way to align a pipe-delimited text file by the maximum field length, where the fields are separated by pipes, for large text files with more than 100,000 rows?

So far I have searched other forums and Google about aligning text files in Unix, and I have noticed that several other users use the awk utility. Since I am new to awk, I attempted to write my own code after reading up on some of the awk syntax, but I am getting stuck.
If awk is not the best utility for this, is there another way to code it?

My test code:

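#!/bin/ksh
# Report the longest value found in each pipe-delimited column.
awk 'BEGIN { FS = "|" }
{
    for (i = 1; i <= NF; i++) {
        # Track the longest value seen so far in column i
        if (length($i) > maxlen[i])
            maxlen[i] = length($i)
    }
    # Remember the largest number of columns on any row
    if (NF > nf) nf = NF
}
END {
    # The opening brace must sit on the same line as END
    for (i = 1; i <= nf; i++) print i, maxlen[i]
}
' $(find . -name "testfile.txt")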

Below is a sample of the text file that I have:

Pipe Delimited Text file
YEAR|NAME|PRODUCT_ID|ORDER_ID|CUSTOMER_ID
2001|Unix book|12354|01587|5487651484
2002|Programming|65487|6564548|654365146
2003|Airsoft Guns|6544888|548|65498
2004|Video Games|101100018|44|648
2010|Wayside Stories from wayside school|5487454|4|64645646
.
.
.

Desired Output:

YEAR|NAME                               |PRODUCT_ID|ORDER_ID|CUSTOMER_ID
2001|Unix book                          |12354     |01587   |5487651484
2002|Programming                        |65487     |6564548 |654365146
2003|Airsoft Guns                       |6544888   |548     |65498
2004|Video Games                        |101100018 |44      |648
2010|Wayside Stories from wayside school|5487454   |4       |64645646
.
.
.

Thanks,
# 2  
Old 08-11-2011
Try adapting this awk script:
Code:
awk -v FS='|' -v OFS='|' '
    {
        if (MaxFields < NF) MaxFields = NF;
        for (i=1; i<=NF; i++) {
            Field[NR, i] = $i;
            l = length($i);
            if (Length[i] < l) Length[i] = l;
        };
    }
    END {
        for (i=1; i<=MaxFields; i++) Format[i] = "%-" Length[i]+0 "s";
        for (n=1; n<=NR; n++) {
            out = "";
            for (i=1; i<=MaxFields; i++) {
                out = out OFS sprintf(Format[i], Field[n, i]);
            }
            print substr(out,2);
        }
    }

' inputfile

Input file:
Code:
YEAR|NAME|PRODUCT_ID|ORDER_ID|CUSTOMER_ID
2001|Unix book|12354|01587|5487651484
2002|Programming|65487|6564548|654365146
2003|Airsoft Guns|6544888|548|65498
2004|Video Games|101100018|44|648
2010|Wayside Stories from wayside school|5487454|4|64645646

Output:
Code:
YEAR|NAME                               |PRODUCT_ID|ORDER_ID|CUSTOMER_ID
2001|Unix book                          |12354     |01587   |5487651484
2002|Programming                        |65487     |6564548 |654365146
2003|Airsoft Guns                       |6544888   |548     |65498
2004|Video Games                        |101100018 |44      |648
2010|Wayside Stories from wayside school|5487454   |4       |64645646
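
The script above keeps every field of every row in memory until END, which may matter for files with well over 100,000 rows. One possible alternative is a two-pass variant: read the file once to find the column widths, then read it again to print the padded rows. This is only a sketch, assuming the input is a regular file that can be read twice (not a pipe); `sample.txt` and the variable names are hypothetical, and `awk` may need to be `nawk` on systems where the default awk is the legacy one.

```shell
# Hypothetical sample file; substitute the real input.
cat > sample.txt <<'EOF'
YEAR|NAME|PRODUCT_ID
2001|Unix book|12354
2002|Programming|65487
EOF

# The same file is named twice so awk reads it in two passes.
aligned=$(awk -F'|' '
    # Pass 1 (NR == FNR): remember the widest value seen in each column.
    NR == FNR {
        for (i = 1; i <= NF; i++)
            if (length($i) > width[i]) width[i] = length($i)
        if (NF > maxnf) maxnf = NF
        next
    }
    # Pass 2: left-justify every field to its column width and print the row.
    {
        line = ""
        for (i = 1; i <= maxnf; i++)
            line = line (i > 1 ? "|" : "") sprintf("%-" (width[i] + 0) "s", $i)
        print line
    }
' sample.txt sample.txt)

printf '%s\n' "$aligned"
```

Only the width table stays in memory, at the cost of reading the file twice. Like the script above, it pads the last column as well, so rows may end in trailing spaces.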


Jean-Pierre.
# 3  
Old 08-11-2011
Hello,

Thanks for helping me out. I tried the code and received the following errors:

awk: syntax error near line 1
awk: bailing out near line 1

Thanks,
# 4  
Old 08-11-2011
Use nawk instead of awk.
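
The "bailing out near line 1" message is typical of the legacy `/usr/bin/awk` shipped on systems such as Solaris, which does not understand `-v` assignments. A small sketch, assuming a POSIX shell where `command -v` is available, that prefers `nawk` when it is installed and falls back to `awk` otherwise; the variable names `AWK` and `check` are only illustrative.

```shell
# Prefer nawk where it exists (e.g. Solaris); otherwise use the default awk.
if command -v nawk >/dev/null 2>&1; then
    AWK=nawk
else
    AWK=awk
fi

# Quick check that the chosen awk accepts -v and splits on the pipe character.
check=$(printf 'a|b|c\n' | "$AWK" -v FS='|' '{ print NF }')
echo "$AWK sees $check fields"
```

The alignment script can then be run as `"$AWK" -v FS='|' -v OFS='|' '...' inputfile` without editing it per system.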
# 5  
Old 08-11-2011
Wow, that is cool. nawk works! Thanks so much. This is just what I needed.
# 6  
Old 08-11-2011
The column command can help too:
Code:
~/unix.com$ echo 'YEAR|NAME|PRODUCT_ID|ORDER_ID|CUSTOMER_ID
2001|Unix book|12354|01587|5487651484
2002|Programming|65487|6564548|654365146
2003|Airsoft Guns|6544888|548|65498
2004|Video Games|101100018|44|648
2010|Wayside Stories from wayside school|5487454|4|64645646' | column -ts'|' | sed -r 's/  ([[:alnum:]])/|\1/g'

# 7  
Old 08-11-2011
Thanks for showing me the column command; I had not heard of it before. However, I got the following errors when I attempted to use it:

sed: illegal option -- r
col.ksh[7]: syntax error at line 7 : `'' unmatched

Thanks,
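
The `-r` option is a GNU sed extension, which would explain the `illegal option` message on a non-GNU sed. A hedged sketch of the same substitution written as a POSIX basic regular expression, with escaped `\(` and `\)` instead of `-r`. The `aligned` sample is hand-written to mimic what `column -ts'|'` prints (columns separated by two spaces), so the snippet does not depend on `column` being installed.

```shell
# Hand-written stand-in for the output of: column -ts'|'
aligned='YEAR  NAME         PRODUCT_ID
2001  Unix book    12354
2002  Programming  65487'

# Basic regular expression: \( \) captures the first character of the next
# column, and the two spaces in front of it are replaced by a pipe.
result=$(printf '%s\n' "$aligned" | sed 's/  \([[:alnum:]]\)/|\1/g')

printf '%s\n' "$result"
```

As with the original one-liner, this relies on no field containing two consecutive spaces and on every field starting with a letter or digit.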
