Data manipulation using shell


 
# 1  
Old 01-13-2013

Dear all

I have a dataset (a tab-delimited text file) with 100 variables (say, var0-var99) and more than 100,000 observations. I want to do the following:

1. For variables var0-var49, add "00" in front of each value (for example, "1" would become "001").

2. For variables var50-var99, add an underscore _ in front of each value (for example, "1" would become "_1").

How should I write the script?
# 2  
Old 01-13-2013
Please give us a concrete example of your input file format. (Use CODE tags.)
# 3  
Old 01-13-2013
Thanks.

The raw data looks like this:

Code:
Var0 Var1 Var2 ... Var50 Var51 ... Var99
1 22 53 ... 3 76 ... 82
.
.
.
.
22 78 65 ... 89 7 ... 12

and after running the code, I hope the data will look like this:

Code:
Var0 Var1 Var2 ... Var50 Var51 ... Var99
001 0022 0053 ... _3 _76 ... _82
.
.
.
.
0022 0078 0065 ... _89 _7 ... _12

# 4  
Old 01-13-2013
Assuming that your actual data file has no headers:
Code:
awk -F'\t' '{for(i=1;i<=50 && i<=NF;i++) $i="00"$i;for(;i<=NF;i++) $i="_"$i}1' OFS='\t' file
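
The sample in post #3 does show a header row. If the real file has one too, a variant along these lines might pass it through untouched. This is only a sketch: `add_prefixes` is a made-up helper name, not something from the thread, and it assumes the header is exactly the first line.

```shell
# add_prefixes FILE (hypothetical helper): print FILE with the header row
# unchanged, "00" prepended to fields 1-50 and "_" to all later fields.
add_prefixes() {
    awk -F'\t' -v OFS='\t' '
        NR == 1 { print; next }                  # pass the header through as-is
        {
            for (i = 1; i <= NF; i++)
                $i = (i <= 50 ? "00" : "_") $i   # choose the prefix by column number
            print
        }
    ' "$1"
}
```

Called as `add_prefixes file > file.new`, it applies the same 50-column split as the one-liner above to every line except the first.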


Last edited by elixir_sinari; 01-13-2013 at 02:41 PM..
# 5  
Old 01-13-2013
Code:
awk -F'\t' '{ for(i=1;i<=NF;i++) (i<=50)?$i="00"$i:$i="_"$i; }1' OFS='\t' file

# 6  
Old 01-13-2013
This is certainly not as elegant as I wanted it to be, nor as elegant as the proposals above:
Code:
$ sed 's/\t\|^/&_/g; s/_/X/51; h; s/X.*$//; s/_/00/g; G; s/\n.*X/_/' file

I was erroneously thinking the s///NUMBER flag would allow for ranges like 1-50, but it doesn't, does it? So the entire thing ended up clumsy...
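
On the range question: the numeric flag takes a single number, not a range, but GNU sed does allow combining it with g, which means "the Nth match and every match after it". That is enough for an open-ended range such as "field 51 onward". A possible shorter version, assuming GNU sed (the `\t`, `\|` and combined `Ng` forms are GNU extensions) and using a made-up helper name `prefix_fields_sed`:

```shell
# prefix_fields_sed FILE (hypothetical helper, GNU sed only).
# Step 1: put "00" in front of every field (at line start and after each tab).
# Step 2: from the 50th tab onward, i.e. from field 51 on, turn "00" into "_".
prefix_fields_sed() {
    sed 's/\t\|^/&00/g; s/\t00/\t_/50g' "$1"
}
```

As a quick check of the flag itself, `printf 'a_b_c_d\n' | sed 's/_/X/2'` prints `a_bXc_d`, while the same command with `2g` prints `a_bXcXd`. Like the original, this sketch takes the input to be a headerless, tab-delimited file.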
# 7  
Old 01-14-2013
Sorry for the late reply, and thank you all for the great help.