KSH Expression
Quick question related to KSH expressions (not unix regular expressions). I am trying to craft a pattern that will correctly identify lines that match the following CSV text in a case statement:
Code:
filename.txt, filename.txt, alpha, nnnn, nnnn, nnnn, Free form text
Originally I simply used an expression like *,*,*,*,*,*,* in the following case statement:
Code:
case ${LINE} in
# Expression 1..n are informational and specific enough that the
# expressions work well
expression 1..n)
... match expressions 1..n logic ... ;;
# CSV lines contain 7 fields and 6 commas
*,*,*,*,*,*,*)
... match valid CSV line logic ... ;;
# Malformed CSV lines or any other not matching my list of expressions
*)
... malformed CSV line or other mismatch ... ;;
esac
Problem: I found that the *,*,*,*,*,*,* CSV expression matches cases such as these:
Code:
field1, field2, field3, field4, field5, field6, field7, field8, field9
field1, field2, field3, field4, field5, field6
field1, field2, field3, field4, field5, field6, field7,,,,,,,
,field1, field2, field3, field4, field5, field6, field7
I have tried numerous variations and have ended up with this expression:
Code:
case ...
...
@(*)@(,)@(*) ) ...
...
esac
I can match more precisely, and this nails the smallest CSV list of "text, text", but I still have to add comma-counting logic that I would rather not include. The commas and asterisks complicate every expression I have tried, essentially because * matches any string, including commas.
Production code is very hard to change where I work once it is implemented, so I'd like to nail down a very precise expression now and let the final *) expression trap all malformed lines. What am I doing wrong? By the way, I have no control over the data file provided to me, so changes to my data source won't happen.
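To make the "* matches commas" point concrete, here is a minimal sketch comparing the loose pattern with one that forbids commas inside each field. The function names matches_loose and matches_strict and the short f1..f9 field values are made up for illustration. The shopt line is only there so the same text also runs under bash; ksh supports +(...) natively.

```shell
# bash needs extglob for the ksh-style +(...) operator; ksh has it built in,
# so the failing shopt call is silently ignored there.
shopt -s extglob 2>/dev/null || true

# Loose pattern from the question: every * may itself swallow commas,
# so any line with six or more commas matches.
matches_loose() {
    case $1 in
        *,*,*,*,*,*,*) return 0 ;;
        *)             return 1 ;;
    esac
}

# Stricter sketch: +([!,]) means "one or more characters, none of them a comma",
# so the line must contain exactly six commas and no empty fields.
matches_strict() {
    case $1 in
        +([!,]),+([!,]),+([!,]),+([!,]),+([!,]),+([!,]),+([!,])) return 0 ;;
        *) return 1 ;;
    esac
}

for line in \
    'f1, f2, f3, f4, f5, f6, f7' \
    'f1, f2, f3, f4, f5, f6, f7, f8, f9' \
    'f1, f2, f3, f4, f5, f6, f7,,,,,,,' \
    ',f1, f2, f3, f4, f5, f6, f7'
do
    if matches_loose "$line";  then loose=match;  else loose=no;  fi
    if matches_strict "$line"; then strict=match; else strict=no; fi
    printf 'loose=%-5s strict=%-5s %s\n' "$loose" "$strict" "$line"
done
```

All four lines satisfy the loose pattern, while only the first, well-formed seven-field line satisfies the strict one. This version says nothing about what each field may contain; it only stops a field from absorbing commas.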
|
||||
|
Code:
file1.txt, file1_original_name.txt, control1, 1001, 100001, 10000, Data Sample 1
file2.txt, file2_original_name.txt, control5, 2001, 100002, 10000, Data Sample 2
file3.txt, file3_original_name.txt, control7, 3001, 100003, 20000, Data Sample 3
|
||||
|
Excellent. This gets me very close. Lines such as this still get through due to the final "*":
Code:
file1.txt, file1_original_name.txt, control1, 1001, 100001, 10000, Data Sample 1,,,,,
Altering the expression helps narrow it down:
Code:
+([_a-zA-Z0-9]).txt,+( )+([_a-zA-Z0-9]).txt,+( )+([a-zA-Z0-9]),+( )+([0-9]),+( )+([0-9]),+( )+([0-9]),+( )+([ a-zA-Z0-9])
and I think this allows one or more characters other than a comma in the final field:
Code:
+([_a-zA-Z0-9]).txt,+( )+([_a-zA-Z0-9]).txt,+( )+([a-zA-Z0-9]),+( )+([0-9]),+( )+([0-9]),+( )+([0-9]),+( )+([!,])
I need to tweak the filenames a bit but I believe I have the basis for nailing down a very precise expression. Thanks for the help!
Thomas
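For anyone who wants to try the last expression, here is a rough sketch that drops it into a case statement and runs it against one of the sample lines from this thread, with and without the trailing commas. The classify function and its "csv" / "malformed" labels are hypothetical stand-ins for the real branch logic. The shopt line is an assumption for bash users; it is not needed in ksh.

```shell
# bash needs extglob for +(...); in ksh the shopt call fails quietly and is ignored.
shopt -s extglob 2>/dev/null || true

# classify prints "csv" for a line matching the 7-field layout, else "malformed".
classify() {
    case $1 in
        # name.txt, name.txt, alphanumeric, digits, digits, digits, free text without commas
        +([_a-zA-Z0-9]).txt,+( )+([_a-zA-Z0-9]).txt,+( )+([a-zA-Z0-9]),+( )+([0-9]),+( )+([0-9]),+( )+([0-9]),+( )+([!,]))
            echo "csv" ;;
        # Anything else falls through to the catch-all branch
        *)
            echo "malformed" ;;
    esac
}

classify 'file1.txt, file1_original_name.txt, control1, 1001, 100001, 10000, Data Sample 1'
classify 'file1.txt, file1_original_name.txt, control1, 1001, 100001, 10000, Data Sample 1,,,,,'
```

The first call should print csv and the second malformed, since +([!,]) cannot absorb the trailing commas. Note that each +( ) requires at least one space after a comma, so a line written as "a.txt,b.txt,..." would land in the malformed branch; *( ) would make the space optional if that is wanted.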