Shell script to Split matrix file with delimiter into multiple files


 
Thread Tools Search this Thread
Top Forums UNIX for Beginners Questions & Answers Shell script to Split matrix file with delimiter into multiple files
# 1  
Old 12-19-2019
This should work for exactly the data sample you presented; for "thousands of columns", additional efforts need to be spent, like closing output files after appending to them:


Code:
awk -F\; '
NR == 1 {for (i=5; i<=NF; i++)  {split ($i, T, "_")
                                 COL[T[1]] = COL[T[1]] OFS  i 
                                }
        }

        {for (c in COL) {n = split (COL[c], T)
                         OUT = $(T[2])
                         for (i=3; i<=n; i++) OUT = sprintf ("%s;%s", OUT, $(T[i]))
                         print $1, $2, $3, $4, OUT  >  (c ".txt")
                        }
        }
' OFS=\; file
---------- A.txt: ----------

ID1;ID2;ID3;ID4;A_1;A_2;A_3
AA;ax;ay;az;01;04;07
BB;bx;by;bz;03;44;27

---------- B.txt: ----------

ID1;ID2;ID3;ID4;B_1;B_2;B_3
AA;ax;ay;az;02;05;08
BB;bx;by;bz;05;15;08

---------- C.txt: ----------

ID1;ID2;ID3;ID4;C_1;C_2;C_3
AA;ax;ay;az;03;06;09
BB;bx;by;bz;33;26;09

Login or Register to Ask a Question

Previous Thread | Next Thread

9 More Discussions You Might Find Interesting

1. UNIX for Beginners Questions & Answers

Shell script to split data with a delimiter having chars and special chars

Hi Team, I have a file a1.txt with data as follows. dfjakjf...asdfkasj</EnableQuotedIDs><SQL><SelectStatement modified='1' type='string'><! The delimiter string: <SelectStatement modified='1' type='string'><! dlm="<SelectStatement modified='1' type='string'><! The above command is... (7 Replies)
Discussion started by: kmanivan82
7 Replies

2. Shell Programming and Scripting

Split file into multiple files using delimiter

Hi, I have a file which has many URLs delimited by space. Now i want them to move to separate files each one holding 10 URLs per file. http://3276.e-printphoto.co.uk/guardian http://abdera.apache.org/ http://abdera.apache.org/docs/api/index.html I have used the below code to arrange... (6 Replies)
Discussion started by: vel4ever
6 Replies

3. Shell Programming and Scripting

awk script to split file into multiple files based on many columns

So I have a space delimited file that I'd like to split into multiple files based on multiple column values. This is what my data looks like 1bc9A02 1 10 1000 FTDLNLVQALRQFLWSFRLPGEAQKIDRMMEAFAQRYCQCNNGVFQSTDTCYVLSFAIIMLNTSLHNPNVKDKPTVERFIAMNRGINDGGDLPEELLRNLYESIKNEPFKIPELEHHHHHH 1ku1A02 1 10... (9 Replies)
Discussion started by: viored
9 Replies

4. Shell Programming and Scripting

Shell script to put delimiter for a no delimiter variable length text file

Hi, I have a No Delimiter variable length text file with following schema - Column Name Data length Firstname 5 Lastname 5 age 3 phoneno1 10 phoneno2 10 phoneno3 10 sample data - ... (16 Replies)
Discussion started by: Gaurav Martha
16 Replies

5. Shell Programming and Scripting

split file into multiple files

Hi, I have a file of the following syntax that has around 120K records that are tab separated. input.txt abc def klm 20 76 . + . klm_mango unix_00000001; abc def klm 83 84 . + . klm_mango unix_0000103; abc def klm 415 439 . + . klm_mango unix_00001043; I am looking for an awk oneliner... (2 Replies)
Discussion started by: jacobs.smith
2 Replies

6. Shell Programming and Scripting

Help- counting delimiter in a huge file and split data into 2 files

I’m new to Linux script and not sure how to filter out bad records from huge flat files (over 1.3GB each). The delimiter is a semi colon “;” Here is the sample of 5 lines in the file: Name1;phone1;address1;city1;state1;zipcode1 Name2;phone2;address2;city2;state2;zipcode2;comment... (7 Replies)
Discussion started by: lv99
7 Replies

7. Shell Programming and Scripting

Split file into multiple files

Hi I have a file that has multiple sequences; the sequence name is the line starting with '>'. It looks like below: infile.txt: >HE_ER tttggtgccttgactcggattgggggacctcccttgggagatcaatcccctgtcctcctgctctttgctc cgtgaaaaggatccacctatgacctctagtcctcagacccaccagcccaaggaacatctcaccaatttca >M7B_Ho_sap... (2 Replies)
Discussion started by: jdhahbi
2 Replies

8. Shell Programming and Scripting

renaming files using split with a delimiter

I have a directory of files that I need to rename by splitting the first and second halves of the filenames using the delimiter "-O" and then renaming with the second half first, followed by two underscores and then the first half. For example, natfinal1995annvol1_14.pdf -O filenum-20639 will be... (2 Replies)
Discussion started by: swimulator
2 Replies

9. Shell Programming and Scripting

Split line to multiple files Awk/Sed/Shell Script help

Hi, I need help to split lines from a file into multiple files. my input look like this: 13 23 45 45 6 7 33 44 55 66 7 13 34 5 6 7 87 45 7 8 8 9 13 44 55 66 77 8 44 66 88 99 6 I want to split every 3 lines from this file to be written to individual files. (3 Replies)
Discussion started by: saint2006
3 Replies
Login or Register to Ask a Question
X2SYS_SOLVE(1gmt)					       Generic Mapping Tools						 X2SYS_SOLVE(1gmt)

NAME
x2sys_solve - Determine systematic corrections from crossovers SYNOPSIS
x2sys_solve -Ccolumn -TTAG -Emode [ COE_list.d ] [ -V ] [ -W ] [ -Z ] [ -bi[s|S|d|D[ncol]|c[var1/...]] ] DESCRIPTION
x2sys_solve will use the supplied crossover information to solve for systematic corrections that can then be applied per track to improve data quality. Several systematic corrections can be solved for using a least-squares approach. Note: Only one data column can be processed at the time. -T Specify the x2sys TAG which tracks the attributes of this data type. -C Specify which data column you want to process. Needed for proper formatting of the output correction table and must match the same option used in x2sys_list when preparing the input data. -E The correction type you wish to model. Choose among the following functions f(p), where p are the m parameters per track that we will fit simultaneously using a least squares approach: c will fit f(p) = a (a constant offset); records must contain cruise ID1, ID2, COE. d will fit f(p) = a + b * d (linear drift; d is distance; records must contain cruise ID1, ID2, d1, d2, COE. g will fit f(p) = a + b sin(y)^2 (1980-1930 gravity correction); records must contain cruise ID1, ID2, latitude y, COE. h will fit f(p) = a + b cos(H) + c cos(2H) + d sin(H) + e sin(2H) (magnetic heading correction); records must contain cruise ID1, ID2, heading H, COE. s will fit f(p) = a * z (a unit scale correction); records must contain cruise ID1, ID2, z1, z2. t will fit f(p) = a + b * (t - t0) (linear drift; t0 is the start time of the track); records must contain cruise ID1, ID2, t1-t0, t2-t0, COE. OPTIONS
No space between the option flag and the associated arguments. COE_list.d Name of file with the required crossover columns as produced by x2sys_list. NOTE: If -bi is used then the first two columns are expected to hold the integer track IDs; otherwise we expect those columns to hold the text string names of the two tracks. -V Selects verbose mode, which will send progress reports to stderr [Default runs "silently"]. -W Means that each input records has an extra column with the composite weight for each crossover record. These are used to obtain a weighted least squares solution [no weights]. -Z For -Ed and -Et, determine the earliest time or shortest distance for each track, then use these values as the local origin for time duration or distance calculations. The local origin is then included in the correction table [Default uses 0]. -bi Selects binary input. Append s for single precision [Default is d (double)]. Uppercase S or D will force byte-swapping. Option- ally, append ncol, the number of columns in your binary input file if it exceeds the columns needed by the program. Or append c if the input file is netCDF. Optionally, append var1/var2/... to specify the variables to be read. EXAMPLES
To fit a simple bias offset to faa for all tracks under the MGD77 tag, try x2sys_list COE_data.txt -V -TMGD77 -Cfaa -Fnc > faa_coe.txt x2sys_solve faa_coe.txt -V -TMGD77 -Cfaa -Ec > coe_table.txt To fit a faa linear drift with time instead, try x2sys_list COE_data.txt -V -TMGD77 -Cfaa -FnTc > faa_coe.txt x2sys_solve faa_coe.txt -V -TMGD77 -Cfaa -Et > coe_table.txt To estimate heading corrections based on magnetic crossovers associated with the tag MGD77 from the file COE_data.txt, try x2sys_list COE_data.txt -V -TMGD77 -Cmag -Fnhc > mag_coe.txt x2sys_solve mag_coe.txt -V -TMGD77 -Cmag -Eh > coe_table.txt To estimate unit scale corrections based on bathymetry crossovers, try x2sys_list COE_data.txt -V -TMGD77 -Cdepth -Fnz > depth_coe.txt x2sys_solve depth_coe.txt -V -TMGD77 -Cdepth -Es > coe_table.txt SEE ALSO
x2sys_binlist(1), x2sys_cross(1), x2sys_datalist(1), x2sys_get(1), x2sys_init(1), x2sys_list(1), x2sys_put(1), x2sys_report(1) GMT 4.5.7 15 Jul 2011 X2SYS_SOLVE(1gmt)