Rearrange fields of delimited text file
Post 303003131 by drl on Thursday 7th of September 2017 09:36:29 PM
Hi.

Making lots of assumptions about the input data, here is a solution that transposes the file, sorts it (in a hybrid manner), and re-transposes:
Code:
#!/usr/bin/env bash

# @(#) s1       Demonstrate sort headers, carrying data fields, datamash, msort

# Utility functions: print-as-echo, print-line-with-visual-space, debug.
# export PATH="/usr/local/bin:/usr/bin:/bin"
LC_ALL=C ; LANG=C ; export LC_ALL LANG
pe() { for _i;do printf "%s" "$_i";done; printf "\n"; }
pl() { pe;pe "-----" ;pe "$*"; }
em() { pe "$*" >&2 ; }
db() { ( printf " db, ";for _i;do printf "%s" "$_i";done;printf "\n" ) >&2 ; }
db() { : ; }
C=$HOME/bin/context && [ -f "$C" ] && "$C" dixf datamash msort

FILE=${1-data1}
E=expected-output.txt

pl " Input data file $FILE:"
head "$FILE"

pl " Expected output:"
head "$E"

# See f3 and f2 for intermediate output.
pl " Results:"
datamash -t ';' transpose < "$FILE" |
tee f3 |
msort -q -l -n 1,1 -d ';' --comparison-type hybrid |
tee f2 |
datamash -t ';' transpose |
tee f1

pl " Verify results if possible:"
C=$HOME/bin/pass-fail
if [ -f "$C" ]; then "$C"; else ( pe; pe " Results cannot be verified." ) >&2; fi

pl " Some detail for datamash, msort:"
dixf datamash msort

exit 0

producing:
Code:
$ ./s1

Environment: LC_ALL = C, LANG = C
(Versions displayed with local utility "version")
OS, ker|rel, machine: Linux, 3.16.0-4-amd64, x86_64
Distribution        : Debian 8.8 (jessie) 
bash GNU bash 4.3.30
dixf (local) 1.50
datamash (GNU datamash) 1.0.6
msort 8.53

-----
 Input data file data1:
a_13;a_2;a_1;a_10
13;2;1;10

-----
 Expected output:
a_1;a_2;a_10;a_13
1;2;10;13

-----
 Results:
a_1;a_2;a_10;a_13
1;2;10;13

-----
 Verify results if possible:

-----
 Comparison of 2 created lines with 2 lines of desired results:
 Succeeded -- files (computed) f1 and (standard) expected-output.txt have same content.

-----
 Some detail for datamash, msort:

datamash        command-line calculations (man)
Path    : /usr/bin/datamash
Version : 1.0.6
Type    : ELF 64-bit LSB executable, x86-64, version 1 (SYSV ...)
Help    : probably available with -h,--help
Repo    : Debian 8.8 (jessie) 
Home    : https://savannah.gnu.org/projects/datamash/ (pm)

msort   sort records in complex ways (man)
Path    : /usr/bin/msort
Version : 8.53
Type    : ELF 64-bit LSB executable, x86-64, version 1 (SYSV ...)
Repo    : Debian 8.8 (jessie) 
Home    : http://www.billposer.org/Software/msort.html (pm)
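
If datamash and msort are not installed, the same transpose / sort / re-transpose idea can be sketched with awk and GNU sort alone. This is only an approximation under stated assumptions: the delimiter is ';', every line has the same number of fields, and GNU sort's -V (version sort) stands in for msort's hybrid comparison, which may order other kinds of keys differently. The helper names transpose and rearrange are made up for this illustration.

```shell
#!/usr/bin/env bash

# Sketch only: transpose, sort on the former header row, transpose back.
# Assumes ';' as delimiter and the same number of fields on every line.
LC_ALL=C ; export LC_ALL

# transpose: rows become columns (hypothetical helper replacing
# "datamash -t ';' transpose").
transpose() {
  awk -F';' '
    { for (i = 1; i <= NF; i++) cell[NR, i] = $i; if (NF > nf) nf = NF }
    END {
      for (i = 1; i <= nf; i++) {
        line = cell[1, i]
        for (r = 2; r <= NR; r++) line = line ";" cell[r, i]
        print line
      }
    }'
}

# rearrange: after transposing, each line is "header;value...", so sorting
# on field 1 with version ordering puts a_2 before a_10.
rearrange() {
  transpose | sort -t ';' -k1,1V | transpose
}

printf '%s\n' 'a_13;a_2;a_1;a_10' '13;2;1;10' | rearrange
```

On the two-line sample above this prints the same two lines as the expected output. A plain lexical sort would not, because it places a_10 and a_13 ahead of a_2, which is the reason for a hybrid or version comparison here.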

Best wishes ... cheers, drl
 
