Rearrange fields of delimited text file Post: 303003131

Forum: Shell Programming and Scripting. Post 303003131 by drl on Thursday 7th of September 2017 09:36:29 PM

Hi.

Here is a solution that makes a number of assumptions about the input data. It transposes the file, sorts the transposed lines on the first field using a hybrid (mixed text and numeric) comparison, and transposes the result back:

Code:

#!/usr/bin/env bash

# @(#) s1       Demonstrate sort headers, carrying data fields, datamash, msort

# Utility functions: print-as-echo, print-line-with-visual-space, debug.
# export PATH="/usr/local/bin:/usr/bin:/bin"
LC_ALL=C ; LANG=C ; export LC_ALL LANG
pe() { for _i;do printf "%s" "$_i";done; printf "\n"; }
pl() { pe;pe "-----" ;pe "$*"; }
em() { pe "$*" >&2 ; }
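db() { ( printf " db, ";for _i;do printf "%s" "$_i";done;printf "\n" ) >&2 ; }
# Redefine db as a no-op to turn debugging output off.
db() { : ; }
# Optional local helper: display the environment and utility versions.
C=$HOME/bin/context && [ -f "$C" ] && "$C" dixf datamash msort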

FILE=${1-data1}
E=expected-output.txt

pl " Input data file $FILE:"
head "$FILE"

pl " Expected output:"
head "$E"

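# Transpose so each column becomes a line, sort the lines on field 1
# (hybrid comparison: embedded integers compare numerically), transpose back.
# See f3 and f2 for intermediate output.
pl " Results:"
datamash -t ';' transpose < "$FILE" |
tee f3 |
msort -q -l -n 1,1 -d ';' --comparison-type hybrid |
tee f2 |
datamash -t ';' transpose |
tee f1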

pl " Verify results if possible:"
C=$HOME/bin/pass-fail
[ -f "$C" ] && "$C" || ( pe; pe " Results cannot be verified." ) >&2

pl " Some detail for datamash, msort:"
dixf datamash msort

exit 0

producing:

Code:

$ ./s1

Environment: LC_ALL = C, LANG = C
(Versions displayed with local utility "version")
OS, ker|rel, machine: Linux, 3.16.0-4-amd64, x86_64
Distribution        : Debian 8.8 (jessie) 
bash GNU bash 4.3.30
dixf (local) 1.50
datamash (GNU datamash) 1.0.6
msort 8.53

-----
 Input data file data1:
a_13;a_2;a_1;a_10
13;2;1;10

-----
 Expected output:
a_1;a_2;a_10;a_13
1;2;10;13

-----
 Results:
a_1;a_2;a_10;a_13
1;2;10;13

-----
 Verify results if possible:

-----
 Comparison of 2 created lines with 2 lines of desired results:
 Succeeded -- files (computed) f1 and (standard) expected-output.txt have same content.

-----
 Some detail for datamash, msort:

datamash        command-line calculations (man)
Path    : /usr/bin/datamash
Version : 1.0.6
Type    : ELF 64-bit LSB executable, x86-64, version 1 (SYSV ...)
Help    : probably available with -h,--help
Repo    : Debian 8.8 (jessie) 
Home    : https://savannah.gnu.org/projects/datamash/ (pm)

msort   sort records in complex ways (man)
Path    : /usr/bin/msort
Version : 8.53
Type    : ELF 64-bit LSB executable, x86-64, version 1 (SYSV ...)
Repo    : Debian 8.8 (jessie) 
Home    : http://www.billposer.org/Software/msort.html (pm)
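
If datamash and msort are not installed, the same transpose, sort, re-transpose idea can be sketched with awk and GNU sort. This is only a rough equivalent, not the script above: it swaps msort's hybrid comparison for the version sort of GNU sort (-V), the function names transpose and rearrange are made up for illustration, and it assumes every line has the same number of ';'-separated fields with no ';' inside a field.

```shell
#!/usr/bin/env bash
# Sketch: transpose, sort on the first column, transpose back, using only
# awk and GNU sort. Assumes equal field counts on every line and no ';'
# or newline inside a field.

# transpose (illustrative name): turn rows into columns for
# ';'-delimited input read from stdin.
transpose() {
  awk -F ';' '
    { for (i = 1; i <= NF; i++) cell[NR, i] = $i   # remember every cell
      if (NF > ncol) ncol = NF }
    END {
      for (i = 1; i <= ncol; i++) {                # one output line per column
        line = cell[1, i]
        for (r = 2; r <= NR; r++) line = line ";" cell[r, i]
        print line
      }
    }'
}

# rearrange (illustrative name): order the columns by their header field.
# GNU sort -V (version sort) places a_2 before a_10; this stands in for
# the hybrid comparison of msort and is not identical to it in general.
rearrange() {
  transpose | LC_ALL=C sort -t ';' -k 1,1V | transpose
}

# Same sample data as in data1 above.
printf '%s\n' 'a_13;a_2;a_1;a_10' '13;2;1;10' | rearrange
```

For the sample data this prints the header line a_1;a_2;a_10;a_13 followed by 1;2;10;13, matching the expected output. Holding the whole file in an awk array is fine for small inputs, but memory use grows with file size.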

Best wishes ... cheers, drl
