Convert a two-column list into a csv


 
Thread Tools Search this Thread
Top Forums Shell Programming and Scripting Convert a two-column list into a csv
# 1  
Old 10-28-2013
Convert a two-column list into a csv

Hi experts,

I have a very large (1.5M lines), sorted but unstructured list that looks like this:

Code:
process_nameA    valueA
process_nameA    valueA
...
process_nameB    valueB
process_nameB    valueB
...
process_nameN    valueN

I'd like to turn this into a csv.

The values are all numerical and can be different. I've been using a bunch of shell commands strung together to manipulate the data to a format I have now, so it would be great if it's possible to pop another shell command onto the end of what I already have to accomplish this.

Code:
process_nameA,process_nameB,...,process_nameN
valueA,valueB,...,valueN
valueA,valueB,...,valueN

Thank you!
# 2  
Old 10-28-2013
This will work for your sample file; not sure it will for larger files/ many columns = intermediate output files. You may need to close files then. XYZ is a prefix dreamt up to distinguish intermediate files; modify if need be. Try:
Code:
awk '!Arr[$1]++ {print $1 > "XYZ_"$1} {print $2 > "XYZ_"$1}' file; paste -d, XYZ_*; rm XYZ_*
process_nameA,process_nameB,process_nameN
valueA,valueB,valueN
valueA,valueB,valueN

# 3  
Old 10-28-2013
An awk approach that might work:
Code:
awk '
        {
                if ( R[$1] )
                {
                        c = R[$1]
                        c = c + 1
                        A[$1","c] = $2
                        R[$1] = c
                }
                else
                {
                        c = 1
                        A[$1","c] = $2
                        R[$1] = c
                        H[++j] = $1
                }
                m = m < c ? c : m
        }
        END {
                for ( i = 1; i <= j; i++ )
                        printf "%s", ( i == j ? H[i] : H[i] OFS )
                printf "\n"
                for ( k = 1; k <= m; k++ )
                {
                        for ( i = 1; i <= j; i++ )
                        {
                                printf "%s", ( i == j ? A[H[i]","k] : A[H[i]","k] OFS )
                        }
                        printf "\n"
                }
        }
' OFS=, file.csv

This User Gave Thanks to Yoda For This Post:
# 4  
Old 10-28-2013
Another smaller awk solution:

Code:
awk '
{
  if ($1!=P) {
     v[++C,0]=P=$1;
     F=1;
  }
  v[C, F++]=$2
  M=(F>M)?F:M
}
END {
   for(j=0;j<M;j++) {
       for(i=0;i<=C;i++)
           $i=v[i,j]
       print
   }
}' OFS=, infile

This User Gave Thanks to Chubler_XL For This Post:
# 5  
Old 10-29-2013
Thanks! I went with the shorter awk approach and it works great!
Login or Register to Ask a Question

Previous Thread | Next Thread

9 More Discussions You Might Find Interesting

1. Shell Programming and Scripting

Need awk or Shell script to compare Column-1 of two different CSV files and print if column-1 matche

Example: I have files in below format file 1: zxc,133,joe@example.com cst,222,xyz@example1.com File 2 Contains: hxd hcd jws zxc cst File 1 has 50000 lines and file 2 has around 30000 lines : Expected Output has to be : hxd hcd jws (5 Replies)
Discussion started by: TestPractice
5 Replies

2. Shell Programming and Scripting

Get maximum per column from CSV file, based on date column

Hello everyone, I am using ksh on Solaris 10 and I'm gathering data in a CSV file that looks like this: 20170628-23:25:01,1,0,0,1,1,1,1,55,55,1 20170628-23:30:01,1,0,0,1,1,1,1,56,56,1 20170628-23:35:00,1,0,0,1,1,2,1,57,57,2 20170628-23:40:00,1,0,0,1,1,1,1,58,58,2... (6 Replies)
Discussion started by: ejianu
6 Replies

3. Shell Programming and Scripting

Shell script to convert IP range csv into a list of single IP's

Hi All, I am looking for some help to convert a csv with IP ranges in.. in the format e.g. 1.1.1.2, 1.1.1.5 2.1.1.10, 2.1.1.20 and would be looking to output as follows: 1.1.1.2 1.1.1.3 1.1.1.4 1.1.1.5 2.1.1.10 2.1.1.11 etc etc up to 2.1.1.20 I have tried a few google... (4 Replies)
Discussion started by: zippyzip
4 Replies

4. Shell Programming and Scripting

Remove the values from a certain column without deleting the Column name in a .CSV file

(14 Replies)
Discussion started by: dhruuv369
14 Replies

5. UNIX for Dummies Questions & Answers

Creating a two column list of date pairs form a single column list

Hi all, looking for some help here. I'm what you'd call a dirty programmer. my shell scripts might be ugly, but they (usually) function... Say I have a single column text file with a list of dates (yyyymmdd) that represent the elevation of a point on that date (I work with land subsidence, so... (2 Replies)
Discussion started by: jbrandt1979
2 Replies

6. Shell Programming and Scripting

awk read column csv and search in other csv

hi, someone to know how can i read a specific column of csv file and search the value in other csv columns if exist the value in the second csv copy entire row with all field in a new csv file. i suppose that its possible using awk but i m not expertise thanks in advance (8 Replies)
Discussion started by: giankan
8 Replies

7. Shell Programming and Scripting

convert huge .xml file in .csv with specific column.

I have huge xml file in server and i want to convert it to .csv with specific column ... i have search in blog but i didn't get any usefully command. Thanks in advance (1 Reply)
Discussion started by: pareshkp
1 Replies

8. Shell Programming and Scripting

return a list of unique values of a column from csv format file

Hi all, I have a huge csv file with the following format of data, Num SNPs, 549997 Total SNPs,555352 Num Samples, 157 SNP, SampleID, Allele1, Allele2 A001,AB1,A,A A002,AB1,A,A A003,AB1,A,A ... ... ... I would like to write out a list of unique SNP (column 1). Could you... (3 Replies)
Discussion started by: phoeberunner
3 Replies

9. UNIX for Dummies Questions & Answers

read a line from a csv file and convert a column to all caps

Hello experts, I am trying to read a line from a csv file that contains '.doc' and print the second column in all caps. e.g. My csv file contains: Test.doc|This is a Test|test1|tes,t2|test-3 Test2.pdf|This is a Second Test| test1|tes,t2|t-est3 while read line do echo "$line" |... (3 Replies)
Discussion started by: orahi001
3 Replies
Login or Register to Ask a Question