Text processing of file


 
# 1  
Old 11-25-2011

I have a text file containing a dataset, and I need to convert it to CSV format.
The file looks like this.
First line:
Code:
-1 3:1 11:1 14:1 19:1 39:1 42:1 55:1 64:1 67:1 73:1 75:1 76:1 80:1 83:1

Second line:
Code:
+1 5:1 11:1 15:1 32:1 39:1 40:1 52:1 63:1 67:1 73:1 74:1 76:1 78:1 83:1

There are 123 columns in total. Only the columns with value 1 are listed here; the remaining columns are 0.

So I would like a CSV file in the following format, with the -1 at the beginning of a row replaced with 0 and +1 replaced with 1:
Code:
0 0 0 1 0 0 0 0 0 0 0 1 0 0 1 0 0 0 0 1 0 0 0 0 0 0 .... 
1 0 0 0 0 1 0 0 0 0 0 1 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 .....

Can anyone help me out?

Last edited by Franklin52; 11-28-2011 at 03:33 PM.. Reason: Please use code tags for data and code samples, thank you
# 2  
Old 11-25-2011
Code:
awk '{
   if ($1=="1") {
      printf "1"
   } else {
      printf "0"
   }
   prev=1;
   for (i=2;i<=NF;i++) {
      split ($i, colval, ":");
      for (j=prev+1; j<colval[1]; j++) {
         printf ",0"
      }
      printf ",%s", colval[2];
      prev=$i
   }
   for (j=prev+1; j<123; j++) {
         printf ",0"
   }
   printf "\n";
}' inputfile

# 3  
Old 11-26-2011
Hello,

I am getting an uneven number of values per row in the CSV file.
There are 123 features, so I expect 123 values of 0 or 1 per row,
plus the class label (-1 or +1), also converted to 0 or 1.

Below are two sample lines of the input file and the corresponding output.

---------------------------------------------------------------------------------------------------
Input file - two lines :
Code:
-1 3:1 11:1 14:1 19:1 39:1 42:1 55:1 64:1 67:1 73:1 75:1 76:1 80:1 83:1 
-1 1:1 6:1 14:1 22:1 36:1 42:1 49:1 64:1 67:1 72:1 74:1 77:1 80:1 83:1

The current output for these two lines is:

Code:
0,0,1,0,0,0,0,0,0,0,1,0,0,1,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,1,0,0,1,0,0,0,0,0,1,0,1,1,0,0,0,1,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
0,1,0,0,0,0,1,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,1,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,1,0,0,0,0,1,0,1,0,0,1,0,0,1,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0

As you may notice, the rows contain an uneven number of values!
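
A quick way to confirm uneven row lengths like the ones reported above is to count the comma-separated fields on each line. The sketch below is only an illustration: `count_fields` is a hypothetical helper name, not something from the thread. With 123 features plus the class label, every row of a correct output should have 124 fields, so the helper should report a single field count.

```shell
# count_fields: hypothetical helper (name not from the thread).
# Prints one "<fields per row> <number of rows>" pair for each distinct row length.
count_fields() {
  awk -F, '{ n[NF]++ } END { for (k in n) print k, n[k] }' "$@"
}
```

Usage: `count_fields output.csv`. More than one line of output means the rows are not all the same length.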
# 4  
Old 11-26-2011
Try this...
Code:
awk '{
        if (!sub(/^-1/, "0")) sub(/^[+]1/, "1")   # label: -1 -> 0, +1 -> 1
        printf $1
        for(i=2;i<=NF;i++){
                split($i,arr,":")
                for(j=last+1;j<arr[1];j++) printf " 0"
                printf " "arr[2]
                last=arr[1]
        }
        for(i=last+1;i<=123;i++) printf " 0"
        last=0;printf "\n"
}' input_file

--ahamed

Last edited by ahamed101; 11-26-2011 at 10:03 AM..
# 5  
Old 11-26-2011
Code:
awk '{x=($1<0)?0:(($1>0)?1:$1);split($0,a);for(i=1;++i<=NF;){split(a[i],b,":");$(b[1]+1)=b[2]};for(i=1;++i<=123;){$i=$i==1?1:0};$1=x}1' OFS="," file
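
The one-liner above is dense. The same idea (place each value at its column index, then default everything else to 0) can be spelled out with an array instead of rewriting the fields in place. This is a sketch of that idea, not a line-for-line expansion of the post; the function name `dense_row` and the `ncols` variable are assumptions made for illustration.

```shell
# dense_row: hypothetical helper; reads sparse "label idx:val ..." lines on stdin.
# Optional first argument: number of feature columns (defaults to 123 as in the thread).
dense_row() {
  awk -v ncols="${1:-123}" '
  {
    label = ($1 > 0) ? 1 : 0          # "+1" -> 1, "-1" -> 0
    split("", row)                    # portable way to empty the array
    for (i = 2; i <= NF; i++) {
      split($i, kv, ":")              # kv[1] = column index, kv[2] = value
      row[kv[1]] = kv[2]
    }
    line = label
    for (c = 1; c <= ncols; c++)      # unlisted columns default to 0
      line = line "," ((c in row) ? row[c] : 0)
    print line
  }'
}
```

Usage: `dense_row < file > file.csv`. Building the line in a variable avoids depending on how awk rebuilds `$0` when fields beyond `NF` are assigned.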

# 6  
Old 11-28-2011
There were a couple of errors (or 4) in my previous solution.

Code:
{
   if ($1=="+1") {
      printf "1"
   } else {
      printf "0"
   }
   prev=0;
   for (i=2;i<=NF;i++) {
      split ($i, colval, ":");
      for (j=prev+1; j<colval[1]; j++) {
         printf ",0"
      }
      printf ",%s", colval[2];
      prev=colval[1];
   }
   for (j=prev+1; j<=123; j++) {
         printf ",0"
   }
   printf "\n";
}


Last edited by CarloM; 11-28-2011 at 06:45 AM..
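
The corrected script in post #6 is shown without the surrounding `awk '...' inputfile` call. One way to run it is sketched below, with the column count passed in as a variable instead of the hard-coded 123. The wrapper name `sparse_to_csv` and the `ncols` variable are assumptions for illustration and do not appear in the thread; the logic follows the corrected script.

```shell
# sparse_to_csv: hypothetical wrapper around the corrected script from post #6.
# Reads stdin; optional first argument is the number of feature columns (default 123).
sparse_to_csv() {
  awk -v ncols="${1:-123}" '
  {
    # Class label: "+1" becomes 1, anything else (such as "-1") becomes 0.
    printf "%s", (($1 == "+1") ? 1 : 0)
    prev = 0
    for (i = 2; i <= NF; i++) {
      split($i, colval, ":")
      # Zero-fill the gap between the previously listed column and this one.
      for (j = prev + 1; j < colval[1]; j++) printf ",0"
      printf ",%s", colval[2]
      prev = colval[1]
    }
    # Pad the remaining columns up to and including ncols.
    for (j = prev + 1; j <= ncols; j++) printf ",0"
    printf "\n"
  }'
}
```

Usage: `sparse_to_csv < inputfile > output.csv`. Each output row then has `ncols + 1` fields, the label followed by one value per column.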
# 7  
Old 01-26-2012
Thanks

Hello,

Thanks a lot. It is working properly now. I will mark this thread as solved.