Convert a fixed width file to a delimited file
Post 303037094 by wisecracker on Tuesday 23rd of July 2019 06:52:52 AM
The phrase is important!
"""any given character delimited"""
I read this as ANY character, be it ASCII, Unicode or binary!

It never even occurred to me to assume that the delimiter would be limited to printable ASCII characters.
So until the OP clarifies what "any given character delimited" means, we have to assume that the "0x00" and "0x80 to 0xFF" characters might be wanted, removed or replaced.
For example, with 0x80 as the delimiter, both sed and tr reject the input here:
Code:
Last login: Tue Jul 23 11:22:27 on ttys000
AMIGA:amiga~> text="A delimited"$'\x80'"text file."$'\x80'
AMIGA:amiga~> byte=$'\x80'
AMIGA:amiga~> sed 's/\'"${byte}"'\+/&\n/g' <<< "${text}"
sed: 1: "s/\?\+/&\n/g": RE error: illegal byte sequence
AMIGA:amiga~> tr -s ${byte} ' ' <<< "${text}"
tr: Illegal byte sequence
AMIGA:amiga~> _
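
A possible workaround, as a minimal sketch only: the "illegal byte sequence" errors above look like sed and tr refusing 0x80 as an invalid character in a multibyte (probably UTF-8) locale. That is an assumption, as the session does not show the locale. Forcing the C locale with LC_ALL=C makes tr work on single bytes. The short 'a\000b' sample, the '|' replacement character and the variable names are made up for illustration, and printf with octal escapes (\200 is 0x80) stands in for $'\x80' so that the lines also run in a plain POSIX sh.

```shell
# Build the sample text with the raw byte 0x80 (octal 200) as the delimiter.
text=$(printf 'A delimited\200text file.\200')

# Replace every 0x80 byte with '|' (an assumed replacement character).
replaced=$(printf '%s' "${text}" | LC_ALL=C tr '\200' '|')

# Remove the 0x80 bytes altogether.
removed=$(printf '%s' "${text}" | LC_ALL=C tr -d '\200')

# 0x00 cannot be stored in a shell variable, so feed it through a pipe.
nul_replaced=$(printf 'a\000b' | LC_ALL=C tr '\000' '|')

printf '%s\n' "${replaced}" "${removed}" "${nul_replaced}"
```

The same LC_ALL=C prefix is commonly used with sed as well, but which regex and replacement escapes sed accepts still depends on the sed implementation.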


Last edited by wisecracker; 07-23-2019 at 09:20 AM.. Reason: Remove typo and add "or replaced"
 
