Working with strings (awk, sed, scripting, etc...)


# 1  

Hi everybody,
For those who are bored, I suggest an exercise.
Given this "csv" string:
Code:
A,B,C,D,E,G

Desired output:
Code:
| (A) A | (A,B) B | (A,B,C) C | (A,B,C,D) D | (A,B,C,D,E) E | G

There are no whitespace characters at the beginning and end of the line.
# 2  
Hi, thanks for the puzzle.

Code:
awk '{s=x; for(i=1; i<=NF-1; i++) {s=s (s?FS:"(") $i; $i=s ") " $i " " }}{sub(" $",x); print x,$0}' FS=, OFS='| ' file
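For readers who find the one-liner dense, here is the same idea spread over several lines with comments. It is only a sketch of how the command above appears to work: the wrapper name `prefix_fields` is made up for illustration, it reads standard input instead of `file`, and it passes the separators with `-F` and `-v` rather than as trailing assignments.

```shell
# Hypothetical wrapper (name invented here): reads CSV lines from stdin.
prefix_fields() {
  awk -F, -v OFS='| ' '
  {
    s = ""                            # running prefix, reset for every line
    for (i = 1; i <= NF - 1; i++) {
      s = s (s == "" ? "(" : FS) $i   # open the parenthesis once, then append ",field"
      $i = s ") " $i " "              # field i becomes "(prefix) field "
    }
    # Assigning to a field rebuilt $0 with OFS ("| ") between the fields.
    sub(/ $/, "")                     # drop a trailing space, if there is one
    print "", $0                      # the empty first argument yields the leading "| "
  }'
}

printf 'A,B,C,D,E,G\n' | prefix_fields
# | (A) A | (A,B) B | (A,B,C) C | (A,B,C,D) D | (A,B,C,D,E) E | G
```

The last field is never touched by the loop, which is why it comes out without a parenthesised prefix.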

This User Gave Thanks to Scrutinizer For This Post:
# 3  
Well, not sure if this is the most elegant solution:

Code:
awk '
    {TMP = $NF
     for (i=NF; i>1; i--)    {sub ("," $i, _)
                              TMP = (i==2?"(":_) $0 (i==2?")":_) " " $(i-1) " | " TMP
                             }
     gsub (/([^ ],[^ ]*)+/, "(&)", TMP)
     print "| "  TMP
    }
' FS=,  file
 | (A) A | (A,B) B | (A,B,C) C | (A,B,C,D) D | (A,B,C,D,E) E | G

EDIT: or, a bit more straightforward,



Code:
awk -F, '{for (i=1; i<NF; i++) {TMP = TMP DL $i; DL = FS; printf "| (%s) %s ", TMP , $i}; print "| " $NF}' file
 | (A) A | (A,B) B | (A,B,C) C | (A,B,C,D) D | (A,B,C,D,E) E | G




EDIT: revisiting the first proposal, it can be simplified somewhat:


Code:
awk '
        {TMP = $NF
         for (i=NF; i>1; i--)   {sub ("," $i, _)
                                 TMP = "(" $0 ") " $(i-1) " | " TMP
                                }
         print "| "  TMP
        }
' FS=,  file


Last edited by RudiC; 05-26-2019 at 05:13 AM..
This User Gave Thanks to RudiC For This Post:
# 4  
@Scrutinizer, @RudiC Thank you for the good examples. I think the last one is optimal. I also have awk variants that split the line into records with RS="," instead of into fields, and a fairly simple sed version. I will share my efforts after a while.

Last edited by nezabudka; 05-25-2019 at 04:49 PM..
# 5  
One might also try:
Code:
awk -F, '
NF {	for(i = 1; i < NF; i++) {
		printf("| (%s", $1)
		for(j = 2; j <= i; j++)
			printf("%s%s", FS, $j)
		printf(") %s ", $i)
	}
	print "| " $NF
}' file

This uses a slightly more verbose approach to the problem, but produces the same output as Scrutinizer's suggestion except for input lines containing no fields. My code produces no output for empty input lines; Scrutinizer's code prints an output line containing a vertical bar, a space, and a newline character for each empty input line.

If you want the output his code produces in that case, remove the first occurrence of NF from my code. If you don't want the output his code produces in that case, change the {sub in his code to NF{sub.
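The difference is easy to see by feeding both commands a single empty line. The sketch below wraps the command from post #2 and the command from post #5 in shell functions; the function names are invented here, and both read standard input instead of `file`.

```shell
# Command from post #2, reading stdin (function name is hypothetical).
with_bar() {
  awk '{s=x; for(i=1; i<=NF-1; i++) {s=s (s?FS:"(") $i; $i=s ") " $i " " }}{sub(" $",x); print x,$0}' FS=, OFS='| '
}

# Command from post #5, reading stdin; the leading NF pattern skips empty lines.
without_bar() {
  awk -F, '
  NF {  for (i = 1; i < NF; i++) {
            printf("| (%s", $1)
            for (j = 2; j <= i; j++)
                printf("%s%s", FS, $j)
            printf(") %s ", $i)
        }
        print "| " $NF
  }'
}

# Brackets make the otherwise invisible output visible.
printf '[%s]\n' "$(echo | with_bar)"      # [| ]
printf '[%s]\n' "$(echo | without_bar)"   # []
```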
This User Gave Thanks to Don Cragun For This Post:
# 6  
Hello to all, and thanks again for participating.
Following @Don_Cragun's post, I added an explanation to each example.
Not very elegant, but it just works:
Code:
#!/bin/bash
:<<SPRAVKA
It works only with a single line.
Spaces are allowed.
SPRAVKA

while read -d, P; do
        T=$T$d$P
        echo -n "$t| ($T) $P"
        d=,
        t=" "
done < file
echo -n " | "
grep -o '.$' file

This one is simple and doesn't even need the hold space.
It processes each line separately.
Code:
sed -rn 's/.$/ | &/; :1;s/^(\S*)(.),/\1 | (\1\2) \2/;t1;s/^ //p' file
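To make the loop easier to follow, here is a commented sketch of the same sed idea. It is an adaptation, not the exact command above: it uses `-E` and `[^ ]` in place of `-r` and `\S` on the assumption that this is a little more portable, and the function name `prefix_sed` is made up. Like the original, it assumes every field is exactly one character long, because `(.)` captures a single character.

```shell
# Hypothetical wrapper (name invented here): reads CSV lines from stdin.
prefix_sed() {
  sed -En '
    # Put " | " in front of the last character (the last field).
    s/.$/ | &/
    :loop
    # Take the longest comma-terminated prefix plus the next field and
    # rewrite it as: shorter prefix, " | (full prefix) field".
    s/^([^ ]*)(.),/\1 | (\1\2) \2/
    # Repeat while the substitution keeps succeeding.
    t loop
    # Strip the leading space and print the finished line.
    s/^ //p
  '
}

printf 'A,B,C\nX,Y\n' | prefix_sed
# | (A) A | (A,B) B | C
# | (X) X | Y
```

Each pass shortens the unprocessed prefix at the start of the line by one field, so the loop stops once nothing is left before the first space.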

This one is elegant, in my opinion, but I doubt its efficiency.
It treats the whole input as a single string in which every record except the last ends with a comma.
Code:
awk 'RT {T = T (T?RS:"(") $1; printf "| " T ") " $1 FS; next} {print "| "$1}' RS=, file


Last edited by nezabudka; 05-26-2019 at 06:04 AM..
# 7  
Quote:
Originally Posted by nezabudka
[..]This one is elegant, in my opinion, but I doubt its efficiency.
It treats the whole input as a single string in which every record except the last ends with a comma.
Code:
awk 'RT {T = T (T?RS:"(") $1; printf "| " T ") " $1 FS; next} {print "| "$1}' RS=, file

Nice use of RT; however, RT is GNU awk only.

Here is another option using RS=, which should work with any POSIX awk:
Code:
awk 'p{printf "| (%s) %s ",s p,p; s=s p RS } {p=$1} END{print "| " p}' RS=, file

Of course, these options only work with single lines; otherwise we would need to add NR%c conditions.
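A minimal sketch of one way around the single-line limit, assuming it is acceptable to give up `RS=,`: split on `-F,` again so that every input line is its own record, and reset the accumulated prefix for each record. Note that this swaps in a different technique and is not the `NR%c` approach mentioned above; the function name `prefix_lines` is made up for illustration.

```shell
# Hypothetical wrapper (name invented here): reads CSV lines from stdin.
prefix_lines() {
  awk -F, '
  NF {
    s = ""                          # prefix accumulated so far, reset per line
    for (i = 1; i < NF; i++) {
      s = (i == 1 ? "" : s FS) $i   # append the next field, comma-separated
      printf "| (%s) %s ", s, $i
    }
    print "| " $NF                  # the last field is printed without a prefix
  }'
}

printf 'A,B,C\nX,Y\n' | prefix_lines
# | (A) A | (A,B) B | C
# | (X) X | Y
```

Resetting `s` inside the action is what keeps the prefix of one line from leaking into the next.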

Last edited by Scrutinizer; 05-26-2019 at 07:47 AM..
This User Gave Thanks to Scrutinizer For This Post: