Split columns into rows


 
# 8  
Old 03-29-2016
Please find below the requirements I have.

INPUT FILE
Code:
10000|1442448000000:[-99]|latChangeDate:14457679000|latest:-99
10001|1442448000000:[-99]|latChangeDate:14438549000|latest:351592032174|1448496000000:[35159203213374]
10002|1442448000000:[35578404848006],latChangeDate:1442468000|latest:35578404846
10003
10004|1442448000000:[35872402070386]|latChangeDate:14425248000|latest:358722070386
10005|1448755200000:[35193106096435]|latest:351931096435

The code I used to extract the values is:
Code:
sed -i 's/"//g' input.txt
printf "ID|latChangeDate|latest\n" >> output1.csv
while read -r line
do
id=`echo $line | grep -o '^[0-9]\+'`
printf "$id|"
if [[ $line =~ latChangeDate ]]
then
lt_cre=`echo $line | grep -o 'latChangeDate:\w\+'`
lt_cre_val=`echo $lt_cre | cut -d: -f2`
printf "${lt_cre_val}|"
else
printf "XXXXXXXXX|"
fi
if [[ $line =~ latest ]]
then
lt_cre_ac_dt=`echo $line | grep -o 'latest:\w\+'`
lt_cre_ac_dt_val=`echo $lt_cre_ac_dt | cut -d: -f2`
printf "${lt_cre_ac_dt_val}\n"
else
printf "XXXXXXXXX\n"
fi
done < /input.txt

The output I got, where the -99 value is missing:
Code:
ID|latChangeDate|latest
10000|14457679000|
10001|14438549000|35159203214
10002|1442468000|35578404846
10003|XXXXXXXXX|XXXXXXXXX
10004|14425248000|358722070386
10005|XXXXXXXXX|351931096435

But I should get:
Code:
10000|14457679000|-99
10001|14438549000|35159203214
10002|1442468000|35578404846
10003|XXXXXXXXX|XXXXXXXXX
10004|14425248000|358722070386
10005|XXXXXXXXX|351931096435

The -99 value is missing from the first row of my output.

---------- Post updated 03-29-16 at 12:23 AM ---------- Previous update was 03-28-16 at 11:56 PM ----------

Hi Ravinder,
This is another file with a negative value.

Last edited by syd; 03-29-2016 at 02:13 AM..
# 9  
Old 03-29-2016
Hello syd,

It is still not clear to me: I do not see the value 35159203214 in your Input_file. Could you please elaborate on your requirement so that we can help you further?


Thanks,
R. Singh
# 10  
Old 03-29-2016
Please find below the sample data.

Input file
Code:
10000|1442000000:[-99]|latChangeDate:1400|latest:-99
10001|1446:[-99]|latChangeDate:144385400|latest:351592174|1448400:[35374]
10002|1424:[38006],latChangeDate:144246|latest:355746
10003
10004|14424:[358386]|latChangeDate:14425200|latest:35872386
10005|14480:[351435]|latest:351935

The output I should get is:
Code:
ID|latChangeDate|latest
10000|1400|-99
10001|144385400|351592174
10002|144246|355746
10003|XXXX|XXXX
10004|14425200|35872386
10005|XXXX|351935

I used this code:
Code:
sed -i 's/"//g' input.txt
printf "ID|latChangeDate|latest\n" >> output1.csv
while read -r line
do
id=`echo $line | grep -o '^[0-9]\+'`
printf "$id|"
if [[ $line =~ latChangeDate ]]
then
lt_cre=`echo $line | grep -o 'latChangeDate:\w\+'`
lt_cre_val=`echo $lt_cre | cut -d: -f2`
printf "${lt_cre_val}|"
else
printf "XXXXXXXXX|"
fi
if [[ $line =~ latest ]]
then
lt_cre_ac_dt=`echo $line | grep -o 'latest:\w\+'`
lt_cre_ac_dt_val=`echo $lt_cre_ac_dt | cut -d: -f2`
printf "${lt_cre_ac_dt_val}\n"
else
printf "XXXXXXXXX\n"
fi
done < /input.txt

For the above script I am getting this output:
Code:
ID|latChangeDate|latest
10000|1400|
10001|144385400|351592174
10002|144246|355746
10003|XXXX|XXXX
10004|14425200|35872386
10005|XXXX|351935

I am not getting the negative value -99.
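
A likely cause, judging from the script above: `\w` matches only letters, digits and underscore, so `grep -o 'latest:\w\+'` finds nothing in `latest:-99` and the field comes out empty. Below is a minimal sketch of the same extraction with an optional minus sign allowed. It assumes a grep that supports `-o` and `-E` (GNU or BSD), the helper name `extract` is hypothetical, and XXXX is used as the placeholder to match the expected output.

```shell
# Sample rows copied from the input file in this post.
input='10000|1442000000:[-99]|latChangeDate:1400|latest:-99
10003
10005|14480:[351435]|latest:351935'

# extract KEY LINE: print the value that follows "KEY:" in LINE,
# or XXXX when KEY is absent. The -? admits an optional minus sign,
# which \w does not match.
extract() {
    val=$(printf '%s\n' "$2" | grep -oE "$1:-?[0-9]+" | cut -d: -f2)
    printf '%s' "${val:-XXXX}"
}

result=$(
    printf 'ID|latChangeDate|latest\n'
    printf '%s\n' "$input" | while IFS= read -r line
    do
        id=${line%%|*}    # everything before the first pipe
        printf '%s|%s|%s\n' "$id" \
            "$(extract latChangeDate "$line")" \
            "$(extract latest "$line")"
    done
)
printf '%s\n' "$result"
```

Passing the line to `printf '%s\n'` instead of an unquoted `echo $line` also avoids word splitting and globbing on the bracketed values.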
# 11  
Old 03-29-2016
Hello syd,

Could you please try the following and let me know if it helps.
Code:
awk -F"|" 'BEGIN{print "ID|latChangeDate|latest"}{for(i=2;i<=NF;i++){if($i ~ /latChangeDate/){m=sub(/.*:/,X,$i);Q=Q?Q OFS $i:$i};if($i ~ /latest/){n=sub(/.*:/,X,$i);Q=Q?Q OFS $i:$i};};if(!m && !n){Q="XXXX" OFS "XXXX"} else if(!m){Q="XXXX" OFS Q} else if(!n){Q=Q OFS "XXXX"};print $1 OFS Q;Q=m=n=""}' OFS="|" Input_file

Output will be as follows.
Code:
ID|latChangeDate|latest
10000|1400|-99
10001|144385400|351592174
10002|144246|355746
10003|XXXX|XXXX
10004|14425200|35872386
10005|XXXX|351935
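
A note on why this keeps the -99: `sub(/.*:/,X,$i)` deletes everything up to and including the last colon in the field (X is an unset variable, so it acts as the empty string), and whatever follows survives, sign included. One thing to watch is that Q is assembled in the order the fields occur, so a row where latest comes before latChangeDate would have its columns swapped. Here is a sketch of an order-independent variant, assuming the same input format; the variable names `lcd`, `lat` and `v` are illustrative only, and the last sample row is a made-up reordering.

```shell
# First three rows are from the thread; the last one is a hypothetical
# row with "latest" placed before the other field.
input='10000|1442000000:[-99]|latChangeDate:1400|latest:-99
10002|1424:[38006],latChangeDate:144246|latest:355746
10003
10005|latest:351935|14480:[351435]'

result=$(printf '%s\n' "$input" | awk -F'|' -v OFS='|' '
{
    lcd = lat = "XXXX"              # placeholders for absent keys
    for (i = 2; i <= NF; i++) {
        v = $i
        sub(/.*:/, "", v)           # greedy: drops everything through the last colon
        if ($i ~ /latChangeDate:/)
            lcd = v
        else if ($i ~ /latest:/)
            lat = v
    }
    print $1, lcd, lat              # fixed column order, whatever the input order
}')
printf '%s\n' "$result"
```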

EDIT: Adding a multi-line form of the same solution.
Code:
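awk -F"|" 'BEGIN{
        print "ID|latChangeDate|latest"
}
{
        # Collect the values of both keys in Q; m and n record whether each was found.
        for(i=2;i<=NF;i++){
                if($i ~ /latChangeDate/){
                        m=sub(/.*:/,X,$i)       # keep only the text after the last colon
                        Q=Q?Q OFS $i:$i
                }
                if($i ~ /latest/){
                        n=sub(/.*:/,X,$i)
                        Q=Q?Q OFS $i:$i
                }
        }
        # Fill in XXXX for whichever key is missing.
        if(!m && !n){
                Q="XXXX" OFS "XXXX"
        } else if(!m){
                Q="XXXX" OFS Q
        } else if(!n){
                Q=Q OFS "XXXX"
        }
        print $1 OFS Q
        Q=m=n=""                                # reset for the next record
}' OFS="|" Input_file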

Hope this helps you.

Thanks,
R. Singh

Last edited by RavinderSingh13; 03-29-2016 at 06:52 AM.. Reason: Adding a non-one liner form of solution now. Removed an unnecessary variable now.
# 12  
Old 03-29-2016
Thank you Ravinder.

It's working.
# 13  
Old 03-29-2016
Try

Input
Code:
[akshay@localhost tmp]$ cat f
10000|1442000000:[-99]|latChangeDate:1400|latest:-99
10001|1446:[-99]|latChangeDate:144385400|latest:351592174|1448400:[35374]
10002|1424:[38006],latChangeDate:144246|latest:355746
10003
10004|14424:[358386]|latChangeDate:14425200|latest:35872386
10005|14480:[351435]|latest:351935


command

Code:
[akshay@localhost tmp]$ awk -vext='latChangeDate,latest' '
BEGIN{
    FS=OFS="|"
    split(ext,s,/,/)
    print "ID|latChangeDate|latest"
}
{ 
    str=$1 
    for(i=1; i in s; i++)
    str = str OFS ((match($0,s[i]":[^|]*"))? (substr($0,RSTART+length(s[i])+1,RLENGTH-(length(s[i])+1))) :"XXXX") 
    print str 
}'  f


output

Code:
ID|latChangeDate|latest
10000|1400|-99
10001|144385400|351592174
10002|144246|355746
10003|XXXX|XXXX
10004|14425200|35872386
10005|XXXX|351935
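
The command above relies on awk's `match()`, which sets RSTART (the position where the match begins) and RLENGTH (its length); `substr()` then skips the key and the colon to leave only the value. Because the pattern `[^|]*` accepts any character except the pipe, a leading minus sign is kept. A small sketch of that one step on a single row follows; the variable names `key`, `skip` and `out` are illustrative only.

```shell
line='10005|14480:[351435]|latest:351935'
key='latest'

out=$(printf '%s\n' "$line" | awk -v key="$key" '{
    if (match($0, key ":[^|]*")) {
        skip = length(key) + 1      # the key itself plus the colon
        print RSTART, RLENGTH, substr($0, RSTART + skip, RLENGTH - skip)
    }
}')
printf '%s\n' "$out"
```

For this row the match starts at character 22 and is 13 characters long, so the value printed is 351935.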

# 14  
Old 03-29-2016
Here's a quick and dirty shell script to do the job. It handles any number of comma-separated values in the second field.

Code:
#!/bin/ksh
# Relies on ksh running the last stage of a pipeline in the current shell,
# so that "cmd | read var" sets var here. This does not work in bash.
infile="convertdata"
outfile="converteddata"

# Remove a leftover output file, if there is one.
function filecleanup
{
    if [[ -f $1 ]] ; then
        rm "$1"
    fi
}

# Print one "party|value" row for each comma-separated value in part_dt.
function writeoutput
{
    i=1
    while :
    do
        print -r -- "$part_dt" | cut -d, -f$i | read value
        if [[ $value == '' ]] ; then
            return
        fi
        print -r -- "$party|$value"
        ((i += 1))
    done
}

filecleanup "$outfile"
echo "Party|Part_dt"
cat "$infile" | while read rec
do
    echo "$rec" | cut -d'|' -f1 | read party
    echo "$rec" | cut -d'|' -f2 | read part_dt
    part_dt=${part_dt#[} # Strip initial bracket
    part_dt=${part_dt%]} # Strip trailing bracket
    part_dt=${part_dt}"," # cut has strange behavior when no delimiter, so make sure each record has a delimiter
    writeoutput
done

---------- Post updated at 01:55 PM ---------- Previous update was at 01:53 PM ----------

Output from above script:

Code:
Party|Part_dt
10000|12080000000
10002|13075200000
10002|-999
10003|13939200000
10004|1347200000
10004|133600000
10004|1152000000
10004|106400000
10004|12800000
10004|117200000
10004|145180000
10004|1451000000
10004|148400000
10004|14240000
10005|16000000
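
The same row expansion can be sketched in awk, assuming each record has the form party|[v1,v2,...] that the script above parses. The input file is not shown in the thread, so the two sample records below are guesses chosen to reproduce the first three output rows; the variable names `list` and `vals` are illustrative only.

```shell
# Hypothetical records in the party|[v1,v2,...] form.
input='10000|[12080000000]
10002|[13075200000,-999]'

result=$(printf '%s\n' "$input" | awk -F'|' -v OFS='|' '
BEGIN { print "Party", "Part_dt" }
{
    list = $2
    gsub(/^\[|\]$/, "", list)       # strip the enclosing brackets
    n = split(list, vals, ",")
    for (i = 1; i <= n; i++)
        print $1, vals[i]           # one output row per value
}')
printf '%s\n' "$result"
```

This reads the file once and avoids starting a `cut` process per value, which matters on large inputs.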


Last edited by Don Cragun; 03-29-2016 at 05:41 PM.. Reason: Get rid of SIZE and FONT tags inside CODE tags.