Extract substring at specific position and length from file line


 
# 1  
Old 03-15-2013

Hi gurus,
I am trying to figure out how to extract a substring from each line of a file, at a specified position and with a specified length.

Example input (file lines)
Code:
dhaskjdsa dsadhkjsa dhsakjdsad hsadkjh
dsahjdksahdsad sahkjd sahdkjsahd sajkdh adhjsak

I want to extract the substring at position 10 with length 4.

Example output:
Code:
 dsad
hdsa

I have been trying to figure out how to do that with sed, grep, etc., with no luck.
Any help appreciated.

# 2  
Old 03-15-2013
Try cut.
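
A minimal sketch of the cut approach, assuming "position 10" is counted from 1 and the data is single-byte text. cut -c takes an inclusive range of character positions, so position 10 with length 4 means columns 10 through 13. The temporary file and the variable names are made up for the illustration.

```shell
#!/bin/sh
# Hypothetical sample file holding the two lines from the question.
sample=$(mktemp)
printf '%s\n' 'dhaskjdsa dsadhkjsa dhsakjdsad hsadkjh' \
              'dsahjdksahdsad sahkjd sahdkjsahd sajkdh adhjsak' > "$sample"

start=10                    # 1-based start position
len=4                       # number of characters wanted
end=$((start + len - 1))    # inclusive end column, 13 here

# cut -c selects an inclusive range of character positions on every line.
result=$(cut -c "${start}-${end}" "$sample")
printf '%s\n' "$result"

rm -f "$sample"
```

With this input the first line yields " dsa" (position 10 is the space) and the second yields "hdsa".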
# 3  
Old 03-15-2013
Another option is bash substring expansion:

Syntax:
Code:
${parameter:offset}
${parameter:offset:length}

E.g.
Code:
#!/bin/bash
while read line
do
   echo ${line:10:4}
done < filename

Check the bash manual for further reference:
Code:
man bash
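
One detail worth keeping in mind: the offset in ${parameter:offset:length} is counted from 0, so the character at 1-based position 10 sits at offset 9. A small sketch with a made-up string (the variable names are only for illustration):

```shell
#!/bin/bash
s='abcdefghijklmnop'       # made-up test string; 'j' is the 10th character

from_offset_10=${s:10:4}   # offset 10 starts at the 11th character
from_offset_9=${s:9:4}     # offset 9 starts at the 10th character

printf '%s\n' "$from_offset_10"   # klmn
printf '%s\n' "$from_offset_9"    # jklm
```

Whether 9 or 10 is the right offset depends on whether the requested position is meant to be counted from 1 or from 0.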

# 4  
Old 03-15-2013
Quote:
Originally Posted by Yoda
Code:
#!/bin/bash
while read line
do
   echo ${line:10:4}
done < filename

No offense intended, but that's a terrible solution. There are quite a few bugs in that short script.

First of all, we don't know anything about the data, so we can't make any assumptions.

If there is leading or trailing whitespace, the field splitting done by read will discard it. This affects the result of the parameter expansion, yielding characters that begin later in the line than desired, and we could miss characters at the end of the substring if they were discarded whitespace.

Without -r, read also treats backslashes as escape characters. If there are backslashes in the data, again, an incorrect substring is the result.

If the correct substring is extracted, it could still fail to print properly if it looks to echo like a valid option or a valid escape sequence.

What if there's an asterisk, a question mark, or a bracketed expression? Those may trigger pathname expansion (aka file globbing) since the parameter expansion is unquoted.

Troublesome sample data:
Code:
1234567890-n a
     678901234
\2\4\6\8\01234
1234567890* *?

If you wanted to do this correctly with bash builtins and parameter expansion, the following is the way:
Code:
while IFS= read -r line; do
    printf '%s\n' "${line:10:4}"
done < filename
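
A quick way to see the difference is to run both loops over two of the troublesome lines. This is only a sketch: the temporary file and the variable names are made up, and printf is used in both loops so that only the behaviour of read differs.

```shell
#!/bin/bash
# Temporary file with the whitespace line and the backslash line.
data=$(mktemp)
printf '%s\n' '     678901234' '\2\4\6\8\01234' > "$data"

# Plain read: leading whitespace is trimmed and backslashes are consumed,
# so both lines shrink to 9 characters and offset 10 is past the end.
naive=$(while read line; do printf '%s\n' "${line:10:4}"; done < "$data")

# IFS= keeps the whitespace, -r keeps the backslashes literal.
robust=$(while IFS= read -r line; do printf '%s\n' "${line:10:4}"; done < "$data")

printf 'naive:\n%s\nrobust:\n%s\n' "$naive" "$robust"
rm -f "$data"
```

The plain loop produces only empty lines for this input, while the IFS= read -r loop prints 1234 for both lines.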

Ygor's suggestion is probably simplest and best.

Regards,
Alister

# 5  
Old 03-15-2013
No offence taken. It is my bad that I really didn't consider scenarios like backslash escaping & globbing. I really appreciate your feedback.

Thank you!
# 6  
Old 03-15-2013
Leading and trailing spaces are not preserved in read, but internal ones are OK:
Code:
$ echo ' a b   c d '|while read l ;do echo ">$l<";done
>a b   c d<

sed is happy to do this:
Code:
$ sed 's/^.\{9\}\(.\{4\}\).*/\1/' in_file > out_file
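
One thing to be aware of with this sed command: a line shorter than 13 characters does not match the pattern, so sed prints it unchanged. A possible variant, sketched here with made-up input and variable names, uses -n together with the p flag so that only lines where the substitution happened are printed:

```shell
#!/bin/sh
# Made-up input: one long line and one line shorter than 13 characters.
input=$(mktemp)
printf '%s\n' 'dsahjdksahdsad sahkjd' 'short' > "$input"

# Without -n, the unmatched short line passes through unchanged.
plain=$(sed 's/^.\{9\}\(.\{4\}\).*/\1/' "$input")

# With -n and the p flag, only successfully substituted lines are printed.
strict=$(sed -n 's/^.\{9\}\(.\{4\}\).*/\1/p' "$input")

printf 'plain:\n%s\nstrict:\n%s\n' "$plain" "$strict"
rm -f "$input"
```

Here the plain command prints "hdsa" followed by "short", and the -n variant prints only "hdsa". Which behaviour is wanted depends on how short lines should be treated.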

