Cut -d Question


 
# 1  
Old 03-19-2007

I went through quite a few threads and didn't find anything on this. I also looked on other sites and couldn't turn up an answer.

For completeness' sake, I'm working on Solaris 10 in the Korn shell.

I wrote a script for a buddy to help him out with the following issue.

He has a directory of files; here is an example of one of the file names:

verylongstringofmixedcharacters==-=-23480732.pdf

He wanted a script that removes the "==-=-" and the numbers after it, so the file name would look like the following:

verylongstringofmixedcharacters.pdf

Using "cut -d= -f1" and "cut -d. -f2", I was able to pull out the "verylongstringofmixedcharacters" part and the "pdf" part. I then built the new file name with the following line:
fileparts=`echo $filepart1'.'$filepart2`

That's not the whole script, but it's the piece that builds the new file name he wanted. When I finished it and sent it off, he gave me the bad news that the verylongstringofmixedcharacters part can sometimes contain an = sign. So it might be "verylong=stringof=mixedcharacters", which breaks my first delimiter of =. My question to you all is the following:

Is there a way to use a multi-character delimiter with cut? Meaning, could it be "cut -d==-=- -f1"? I've tried it the following ways and received an invalid delimiter message each time:

cut -d==-=- -f1
cut -d"==-=-" -f1
cut -d'==-=-' -f1
cut -d "==-=-" -f1
cut -d '==-=-' -f1

I'm thinking I'll have to use a sed command of some sort to fix this. I'll be looking into sed one-liners that might help after I post this, but I figured I'd stimulate your brains with it. Thanks in advance for your help.
~Ryan
# 2  
Old 03-19-2007
You could set the Field Separator pattern in awk...
Code:
$ echo "verylongstringofmixedcharacters==-=-23480732.pdf"|awk 'BEGIN{FS="==-=-"}{print $1}'
verylongstringofmixedcharacters

You can even use multiple patterns...
Code:
$ echo "verylongstringofmixedcharacters==-=-23480732.pdf"|awk 'BEGIN{FS="(==-=-|\\.)";OFS="."}{print $1,$3}'
verylongstringofmixedcharacters.pdf

Or use sed...
Code:
$ echo "verylongstringofmixedcharacters==-=-23480732.pdf"|sed 's/\(.*\)==-=-.*\(\.pdf\)/\1\2/'
verylongstringofmixedcharacters.pdf

# 3  
Old 03-19-2007
Code:
for file in *==-=-* ; do
    mv "$file" "${file%%==-=-*}.${file##*.}"
done
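
For anyone reading along, here is a rough sketch of what the two parameter expansions in that loop produce. The sample file name is made up for illustration (it has extra = signs in the part to keep), and nothing is renamed here; the result is only printed.

```shell
# Hypothetical sample name with '=' inside the part we want to keep.
file='verylong=stringof=mixedcharacters==-=-23480732.pdf'

# %% removes the longest suffix matching the glob pattern '==-=-*'.
base=${file%%==-=-*}

# ## removes the longest prefix matching '*.', leaving only the extension.
ext=${file##*.}

newname="$base.$ext"
echo "$newname"
```

Because the pattern needs the full "==-=-" sequence, a lone = earlier in the name is left alone, and the extension can be any length.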


Last edited by reborg; 03-19-2007 at 11:04 PM..
# 4  
Old 03-19-2007
Ygor, I'm very new to sed and have only started to realize the true potential of it. I hate to be a bother, but if you could explain how your command is interpreted, it would be greatly helpful:

sed 's/\(.*\)==-=-.*\(\.pdf\)/\1\2/'

Since posting the original question I have been using the following command:

sed 's/==-=-*//g'

For some reason it doesn't recognize the * as the wildcard character, even if I put a \ before it.

Reborg, I'd like to note that not all of the files have a .pdf extension; there might be .xls, .txt or other file types. I used pdf just as an example, sorry if I misled folks there.

Thank you in advance for your help!

Last edited by Janus; 03-19-2007 at 10:13 PM..
# 5  
Old 03-19-2007
see edit above.

For sed, you are dealing with an RE (regular expression), not a glob pattern.

* in an RE means 0 or more of the preceding character, not any number of any character as in a glob pattern. The single-character wildcard in an RE is ".", therefore ".*" corresponds (more or less) to * in a glob pattern.
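
A small sketch of that difference, using the example name from the first post. The variable names are only for illustration.

```shell
name='verylongstringofmixedcharacters==-=-23480732.pdf'

# '-*' means "zero or more '-' characters", so only the literal
# '==-=-' is removed and the digits stay in place.
attempt=$(printf '%s\n' "$name" | sed 's/==-=-*//')

# '.*' means "zero or more of any character", so everything from
# the marker to the end of the line is removed (extension included).
stripped=$(printf '%s\n' "$name" | sed 's/==-=-.*//')

echo "$attempt"
echo "$stripped"
```

The second form also eats the extension, which is why the extension has to be captured and put back in the replacement.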
# 6  
Old 03-19-2007
Quote:
Originally Posted by Janus
if you could explain how your command is interpreted, it would be greatly helpful:

sed 's/\(.*\)==-=-.*\(\.pdf\)/\1\2/'
Search:
Code:
\(     start of first bracketed pattern
.*     any number of characters
\)     end of first bracketed pattern
==-=-  literal
.*     any number of characters
\(     start of second bracketed pattern
\.pdf  literal ("." is escaped to prevent its special meaning)
\)     end of second bracketed pattern

Replacement:
Code:
\1     first bracketed pattern
\2     second bracketed pattern
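
To see the two bracketed patterns separately, one option is to print them inside square brackets instead of joining them. This is only an illustration; the sample name (with extra = signs) is made up.

```shell
name='verylong=stringof=mixedcharacters==-=-23480732.pdf'

# Same search pattern as above; the replacement wraps each captured
# group in [ ] so it is visible what \1 and \2 hold.
groups=$(printf '%s\n' "$name" | sed 's/\(.*\)==-=-.*\(\.pdf\)/[\1] [\2]/')

echo "$groups"
```

The first ".*" still has to be followed by the literal "==-=-", so a single = inside the name ends up in \1 rather than being treated as a delimiter.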

# 7  
Old 03-19-2007
Thank you, I see now how it all fits together. Seeing the actual meaning behind what each piece is doing helps out quite a bit! As stated in the replies above, .pdf isn't necessarily the only file extension this would have to look for. I made a slight change to it and it seems to work:

sed 's/\(.*\)==-=-.*\(\....\)/\1\2/'

I took pdf and replaced it with ...; the only question is, what if the file extension is longer than 3 characters? I don't think ... will work then, since it matches exactly 3 characters for that second part. What would be the correct expression to say "any number of characters" and not just 3? Thanks again, both of ye, for your input and help!
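
One possible way to handle an extension of any length, offered as a sketch rather than the definitive answer: match a literal dot followed by any number of non-dot characters, anchored to the end of the line. The function name strip_marker and the sample file names are made up for illustration.

```shell
# strip_marker is a hypothetical helper name, not part of the thread.
# \.[^.]*$ matches the last dot and everything after it, so the
# extension can be any length.
strip_marker() {
    printf '%s\n' "$1" | sed 's/\(.*\)==-=-.*\(\.[^.]*\)$/\1\2/'
}

strip_marker 'verylongstringofmixedcharacters==-=-23480732.pdf'
strip_marker 'verylong=stringof=mixedcharacters==-=-23480732.xlsx'
strip_marker 'notes==-=-42.txt'
```

A name without a dot after the marker does not match the pattern, so sed prints it unchanged.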