Rearrange fields of delimited text file


 
Thread Tools Search this Thread
Top Forums Shell Programming and Scripting Rearrange fields of delimited text file
# 8  
Old 09-07-2017
Note:
--

tr understands the newline escape sequence \n so there is no need to use a hard newline character..

so
Code:
tr ';' '\n'

can be used to translate semicolons into newlines.
# 9  
Old 09-07-2017
If Perl is an option, then one way could be:

Code:
$
$ cat input.txt
a_13;a_2;a_1;a_10
13;2;1;10
$
$
$ perl -F';' -lane 'if ($. == 1){
                        %x = map{ $F[$_] => $_ } (0..$#F);
                        @s = map{ $x{$_} } sort { (split "_", $a)[1] <=> (split "_", $b)[1] } @F;
                    }
                    print join ";", map{ $F[$s[$_]] }(0..$#F);
                   ' input.txt
a_1;a_2;a_10;a_13
1;2;10;13
$
$

# 10  
Old 09-07-2017
thank you all

@Corona688: you say "When you say "only first line", what do you mean? Clearly the second line is changed too. "
..and that what I mean

---------- Post updated at 04:24 PM ---------- Previous update was at 04:13 PM ----------

@RudiC: your solution doesn't work correctly:

1_ICD;11_ICD;15_ICD;3_ICD
a1;a11;a15;a3

Smilie

Last edited by RudiC; 09-09-2017 at 01:18 PM..
# 11  
Old 09-07-2017
Does mine?
# 12  
Old 09-07-2017
Quote:
Originally Posted by andy2000
...
...
@RudiC: your solution doesn't work correctly:

1_ICD;11_ICD;15_ICD;3_ICD
a1;a11;a15;a3

...
The Perl script will not work either because the numeric part of the header columns moved to the left of the underscore ("_") instead of the right.

So, the index to compare after the "split" function should be 0 and not 1.

Code:
$ 
$ cat input_1.txt
1_ICD;11_ICD;15_ICD;3_ICD
a1;a11;a15;a3
$ 
$ 
$ perl -F';' -lane 'if ($. == 1){
                        %x = map{ $F[$_] => $_ } (0..$#F);
                        @s = map{ $x{$_} } sort { (split "_", $a)[0] <=> (split "_", $b)[0] } @F;
                    }
                    print join ";", map{ $F[$s[$_]] }(0..$#F);
                   ' input_1.txt
1_ICD;3_ICD;11_ICD;15_ICD
a1;a3;a11;a15
$ 
$

If, however, the numeric part in the header row can be on either side of the underscore, like so:

Code:
$ 
$ cat input_2.txt
1_ICD;ICD_11;15_ICD;ICD_3
a1;a11;a15;a3
$ 
$

then things get a bit more serious Smilie

Code:
$ 
$ perl -F';' -lane 'if ($. == 1){
                        %x = map{ $F[$_] => $_ } (0..$#F);
                        %y = map{ $m = $_; ($n = $m) =~ s/\D//g; $m => $n } @F;
                        @s = map{ $x{$_} } sort { $y{$a} <=> $y{$b} } keys(%y);
                    }
                    print join ";", map{ $F[$s[$_]] }(0..$#F);
                   ' input_2.txt
1_ICD;ICD_3;ICD_11;15_ICD
a1;a3;a11;a15
$ 
$

Of course, the last script will not work for a header row that could have numeric part on both sides of the underscore, like so:

Code:
1_ICD;2_ICD_6;5_ICD;ICD_7
aa;bb;cc;dd

In that case, it has to be first determined if the numeric part of 2nd column header is 2, 6 or 26.
This User Gave Thanks to durden_tyler For This Post:
# 13  
Old 09-07-2017
Hi.

Making lots of assumptions about the input data, here is a solution that transposes the file, sorts it (in a hybrid manner), and re-transposes:
Code:
#!/usr/bin/env bash

# @(#) s1       Demonstrate sort headers, carrying data fields, datamash, msort

# Utility functions: print-as-echo, print-line-with-visual-space, debug.
# export PATH="/usr/local/bin:/usr/bin:/bin"
LC_ALL=C ; LANG=C ; export LC_ALL LANG
pe() { for _i;do printf "%s" "$_i";done; printf "\n"; }
pl() { pe;pe "-----" ;pe "$*"; }
em() { pe "$*" >&2 ; }
db() { ( printf " db, ";for _i;do printf "%s" "$_i";done;printf "\n" ) >&2 ; }
db() { : ; }
C=$HOME/bin/context && [ -f $C ] && $C dixf datamash msort

FILE=${1-data1}
E=expected-output.txt

pl " Input data file $FILE:"
head $FILE

pl " Expected output:"
head $E

# See f3 and f2 for intermediate output.
pl " Results:"
datamash -t ';' transpose < $FILE |
tee f3 |
msort -q -l -n 1,1 -d ';' --comparison-type hybrid |
tee f2 |
datamash -t ';' transpose |
tee f1

pl " Verify results if possible:"
C=$HOME/bin/pass-fail
[ -f $C ] && $C || ( pe; pe " Results cannot be verified." ) >&2

pl " Some detail for datamash, msort:"
dixf datamash msort

exit 0

producing:
Code:
$ ./s1

Environment: LC_ALL = C, LANG = C
(Versions displayed with local utility "version")
OS, ker|rel, machine: Linux, 3.16.0-4-amd64, x86_64
Distribution        : Debian 8.8 (jessie) 
bash GNU bash 4.3.30
dixf (local) 1.50
datamash (GNU datamash) 1.0.6
msort 8.53

-----
 Input data file data1:
a_13;a_2;a_1;a_10
13;2;1;10

-----
 Expected output:
a_1;a_2;a_10;a_13
1;2;10;13

-----
 Results:
a_1;a_2;a_10;a_13
1;2;10;13

-----
 Verify results if possible:

-----
 Comparison of 2 created lines with 2 lines of desired results:
 Succeeded -- files (computed) f1 and (standard) expected-output.txt have same content.

-----
 Some detail for datamash, msort:

datamash        command-line calculations (man)
Path    : /usr/bin/datamash
Version : 1.0.6
Type    : ELF 64-bit LSB executable, x86-64, version 1 (SYSV ...)
Help    : probably available with -h,--help
Repo    : Debian 8.8 (jessie) 
Home    : https://savannah.gnu.org/projects/datamash/ (pm)

msort   sort records in complex ways (man)
Path    : /usr/bin/msort
Version : 8.53
Type    : ELF 64-bit LSB executable, x86-64, version 1 (SYSV ...)
Repo    : Debian 8.8 (jessie) 
Home    : http://www.billposer.org/Software/msort.html (pm)

Best wishes ... cheers, drl
# 14  
Old 09-08-2017
@Corona688 it doesen't work

@durden_tyler it doesen't work

Last edited by andy2000; 09-08-2017 at 07:39 AM..
Login or Register to Ask a Question

Previous Thread | Next Thread

10 More Discussions You Might Find Interesting

1. Shell Programming and Scripting

Pattern Match and Rearrange the Fields in UNIX

For an Output like below Input : <Subject A="I" B="1039502" C="2015-06-30" D="010101010101"> Output : <Subject D="010101010101" B="1039502" C="2015-06-30" A="I"> I have been using something like below but not getting the desired output : awk -F ' ' '/Subject/ BEGIN{OFS=" ";}... (19 Replies)
Discussion started by: arunkesi
19 Replies

2. UNIX for Dummies Questions & Answers

Need to convert a pipe delimited text file to tab delimited

Hi, I have a rquirement in unix as below . I have a text file with me seperated by | symbol and i need to generate a excel file through unix commands/script so that each value will go to each column. ex: Input Text file: 1|A|apple 2|B|bottle excel file to be generated as output as... (9 Replies)
Discussion started by: raja kakitapall
9 Replies

3. Shell Programming and Scripting

Using awk to rearrange fields

Hi, I am required to arrange columns of a file i.e make the 15th column into the 1st column. I am doing awk 'begin {fs=ofs=","} {print $15,$1,$2,$3,$4,$5,$6,$7,$8,$9,$10,$11,$12,$13,$14}' ad.data>ad.csv the problem is that column 15 gets to column 1 but it is not comma separated with the... (10 Replies)
Discussion started by: seddoubt
10 Replies

4. Shell Programming and Scripting

Split a free form text delimited by space to words with other fields

Hi, I need your help for below with shell scripting or perl I/P key, Sentence customer1, I am David customer2, I am Taylor O/P Key, Words Customer1,I Customer1,am Customer1,David Customer2,I Customer2,am Customer2,Taylor (4 Replies)
Discussion started by: monishathampi
4 Replies

5. Shell Programming and Scripting

Print records which do not have expected number of fields in a comma delimited file

Hi, I have a comma (,) delimited file, in which few fields are enclosed with in double quotes " ". I have to print the records in the file which donot have expected number of field with the line number. File1 ==== name,desgnation,doj,project #header#... (7 Replies)
Discussion started by: machomaddy
7 Replies

6. UNIX for Advanced & Expert Users

Problem while counting number of fields in TAB delimited file

I'm facing a strange problem, please help me out. Here we go. I want to count number of fields in particular file. filename and delimiter character will be passed through parameter. On command prompt if i type following i get 27 as output (which is correct) cat customer.dat | head -1 | awk... (12 Replies)
Discussion started by: vikanna
12 Replies

7. Shell Programming and Scripting

Rearrange the text file

Gents, I have a large file and each line of the file contains more than 200 bytes.Please let me a way to have the new line to start when the word "FIT" appears. I was trialling with 'tr' command but i am not sure how to get it based on bytes and so it wasn't working... Current... (3 Replies)
Discussion started by: appu2176
3 Replies

8. Shell Programming and Scripting

Large pipe delimited file that I need to add CR/LF every n fields

I have a large flat file with variable length fields that are pipe delimited. The file has no new line or CR/LF characters to indicate a new record. I need to parse the file and after some number of fields, I need to insert a CR/LF to start the next record. Input file ... (2 Replies)
Discussion started by: clintrpeterson
2 Replies

9. Shell Programming and Scripting

awk sed cut? to rearrange random number of fields into 3 fields

I'm working on formatting some attendance data to meet a vendors requirements to upload to their system. With some help on the forums here, I have the data close. But they've since changed what they want. The vendor wants me to submit three fields to them. Field 1 is the studentid field,... (4 Replies)
Discussion started by: axo959
4 Replies

10. UNIX for Dummies Questions & Answers

Sort the fields in a comma delimited file

Hi, I have a comma delimited file. I want to sort the fields alphabetically and again store them in a comma delimited file. For example, My file looks like this. abc,aaa,xyz,xxx,def pqr,ggg,eee,iii,qqq zyx,lmo,pqr,abc,fff and I want my output to look like this, all fields sorted... (3 Replies)
Discussion started by: swethapatil
3 Replies
Login or Register to Ask a Question