Making a composite file of transposed columns
Shell Programming and Scripting, Post 302898545 by LMHmedchem on Tuesday 22nd of April 2014 02:52:58 PM

Hello,

I have a directory with a lot of tab-delimited text files that have data that look like,
Code:
filePath	distance
(1,4-dioxan-2-ylmethyl)methylamine	0.0
4-methylmorpholine	0.0755473632594
1-propyl-4-piperidone	0.157792911954
heptaminol	0.158142893249
N-acetylputrescine	0.158689628956
spermidine	0.170417125303

For simplicity, I have included the first 7 rows of the file, but there are 25.

I need to go through these files and, from each one, extract the string at column 1, row 2 and the values at column 2, rows 3 through n. I then need to transpose this data so that there is one row for each input file.

For the data above, the row would look like,
Code:
(1,4-dioxan-2-ylmethyl)methylamine	0.075547363	0.157792912	0.158142893	0.158689629	0.170417125

There are many files, so if the second input file was,
Code:
filePath	distance
(1-methyl(4-piperidyl))(3-pyridylmethyl)amine	0.0
lidocaine	0.0971033747257
methoxyphenamine	0.106031307815
meperidine	0.107826404718
fenspiride	0.118603492524
tetracaine	0.122268535847

The output for both files would look like,
Code:
(1,4-dioxan-2-ylmethyl)methylamine	0.075547363	0.157792912	0.158142893	0.158689629	0.170417125
(1-methyl(4-piperidyl))(3-pyridylmethyl)amine	0.097103375	0.106031308	0.107826405	0.118603493	0.122268536

I'm guessing that it could be done with awk, but I have never transposed columns before. My alternative is Excel, so suggestions would be appreciated. I have about 1500 files to go through, so this will take a while if I can't get something automated.

thanks,

LMHmedchem
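
One way the request above might be handled with awk is sketched below. It is only a sketch under stated assumptions: every input file is tab-delimited with a single header line, the name wanted is in column 1 of row 2, and the values are rounded to 9 decimal places (inferred from the example output, not stated in the post). The function name `make_composite`, the `*.txt` glob, and the output name `composite.tsv` are hypothetical, and `LC_ALL=C` is added only to keep the decimal point a period.

```shell
#!/bin/sh
# Sketch: build one output row per input file.
#   row = <column 1 of row 2> TAB <column 2 of rows 3..n, rounded to 9 decimals>
# Hypothetical usage: sh make_composite.sh *.txt > composite.tsv

make_composite() {
    LC_ALL=C awk -F'\t' '
        FNR == 1 {                       # header line of each file
            if (NR > 1) printf "\n"      # close the row built from the previous file
            next                         # the header itself is not copied
        }
        FNR == 2 { printf "%s", $1; next }   # row 2: keep the name, drop its 0.0
        { printf "\t%.9f", $2 }              # rows 3..n: append the rounded distance
        END { if (NR > 0) printf "\n" }      # terminate the last row
    ' "$@"
}

# Run only when file names were given, so the function can also be sourced.
if [ "$#" -gt 0 ]; then
    make_composite "$@"
fi
```

Because awk resets FNR for each file, a single invocation can take all the files at once. Run against the two sample files above, this should produce the two rows shown in the expected output. If the argument list is too long for the shell, the files could be fed in batches, since each batch produces complete rows.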
 

RS(1)                     BSD General Commands Manual                    RS(1)

NAME
     rs -- reshape a data array

SYNOPSIS
     rs [-CcSs[x]] [-GgKkw N] [-EeHhjmnTtyz] [rows [cols]]

DESCRIPTION
     rs reads the standard input, interpreting each line as a row of
     blank-separated entries in an array, transforms the array according to
     the options, and writes it on the standard output.  With no arguments it
     transforms stream input into a columnar format convenient for terminal
     viewing.

     The shape of the input array is deduced from the number of lines and the
     number of columns on the first line.  If that shape is inconvenient, a
     more useful one might be obtained by skipping some of the input with the
     -k option.  Other options control interpretation of the input columns.

     The shape of the output array is influenced by the rows and cols
     specifications, which should be positive integers.  If only one of them
     is a positive integer, rs computes a value for the other which will
     accommodate all of the data.  When necessary, missing data are supplied
     in a manner specified by the options and surplus data are deleted.
     There are options to control presentation of the output columns,
     including transposition of the rows and columns.

     The options are as follows:

     -C[x]   Output columns are delimited by the single character x.  A
             missing x is taken to be '^I'.

     -c[x]   Input columns are delimited by the single character x.  A
             missing x is taken to be '^I'.

     -E      Consider each character of input as an array entry.

     -e      Consider each line of input as an array entry.

     -GN     The gutter width has N percent of the maximum column width added
             to it.

     -gN     The gutter width (inter-column space), normally 2, is taken to
             be N.

     -H      Like -h, but also print the length of each line.

     -h      Print the shape of the input array and do nothing else.  The
             shape is just the number of lines and the number of entries on
             the first line.

     -j      Right adjust entries within columns.

     -KN     Like -k, but print the ignored lines.

     -kN     Ignore the first N lines of input.

     -m      Do not trim excess delimiters from the ends of the output array.

     -n      On lines having fewer entries than the first line, use null
             entries to pad out the line.  Normally, missing entries are
             taken from the next line of input.

     -S[x]   Like -C, but padded strings of x are delimiters.

     -s[x]   Like -c, but maximal strings of x are delimiters.

     -T      Print the pure transpose of the input, ignoring any rows or cols
             specification.

     -t      Fill in the rows of the output array using the columns of the
             input array, that is, transpose the input while honoring any
             rows and cols specifications.

     -wN     The width of the display, normally 80, is taken to be the
             positive integer N.

     -y      If there are too few entries to make up the output dimensions,
             pad the output by recycling the input from the beginning.
             Normally, the output is padded with blanks.

     -z      Shrink column widths to fit the largest entries appearing in
             them.

     With no arguments, rs transposes its input, and assumes one array entry
     per input line unless the first non-ignored line is longer than the
     display width.  Option letters which take numerical arguments interpret
     a missing number as zero unless otherwise indicated.

EXAMPLES
     rs can be used as a filter to convert the stream output of certain
     programs (e.g., spell, du, file, look, nm, who, and wc(1)) into a
     convenient ``window'' format, as in

           $ who | rs

     This function has been incorporated into the ls(1) program, though for
     most programs with similar output rs suffices.

     To convert stream input into vector output and back again, use

           $ rs 1 0 | rs 0 1

     A 10 by 10 array of random numbers from 1 to 100 and its transpose can
     be generated with

           $ jot -r 100 | rs 10 10 | tee array | rs -T >tarray

     In the editor vi(1), a file consisting of a multi-line vector with 9
     elements per line can undergo insertions and deletions, and then be
     neatly reshaped into 9 columns with

           :1,$!rs 0 9

     Finally, to sort a database by the first line of each 4-line field, try

           $ rs -eC 0 4 | sort | rs -c 0 1

SEE ALSO
     jot(1), pr(1), sort(1), vi(1)

BUGS
     Handles only two dimensional arrays.

     The algorithm currently reads the whole file into memory, so files that
     do not fit in memory will not be reshaped.

     Fields cannot be defined yet on character positions.

     Re-ordering of columns is not yet possible.

     There are too many options.

BSD                             April 14, 2012                             BSD
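
Where rs is not installed, the pure transpose described under -T can be approximated in awk. The sketch below is not a replacement for rs: it assumes tab-delimited input, writes tab-delimited output with no column padding, and fills short rows with empty entries. The function name `transpose` is hypothetical. Like rs, it holds the whole input in memory.

```shell
#!/bin/sh
# Sketch: pure transpose of tab-delimited input (rows become columns).
# Hypothetical usage: transpose < table.tsv   or   transpose table.tsv

transpose() {
    awk -F'\t' '
        {
            # Store every field, indexed by (row, column).
            for (i = 1; i <= NF; i++) cell[NR, i] = $i
            if (NF > ncols) ncols = NF      # remember the widest row
        }
        END {
            # Emit one output line per input column.
            for (i = 1; i <= ncols; i++) {
                line = cell[1, i]
                for (r = 2; r <= NR; r++) line = line "\t" cell[r, i]
                print line
            }
        }
    ' "$@"
}
```

Called with no file arguments, the function reads standard input, so it can sit in a pipeline the same way rs does.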