Extract large list of substrings
Post 302233883 by cfajohnson on Monday 8th of September 2008 05:34:12 PM
Quote:
Originally Posted by dcfargo
I have a very long string (millions of characters).

Where do you have it? Is it in a file? In a variable?

Are there any newlines in the string?
Quote:
I have a file with start location and length that is thousands of rows long:

Start Length
5 10
16 21
44 100
215 37
...

I'd like to extract the substring that corresponds to the start and length from each row of the list:

I tried just using one large awk command:

awk '{print substr($1,5,10), "\n", substr($1,16,21) "\n", substr($1,44,100) "\n", substr($1,215,37)...}' infile > outfile &

But it seems to hang, likely because the Bash command line is too long.

I have no problem extracting portions of a multimegabyte string using bash's parameter expansion:
Code:
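## Assuming the string is in 'infile'
string=$( < infile )
## Note: bash offsets count from 0, whereas awk's substr() counts from 1.
## If the start points are 1-based, use "${string:$((start-1)):$length}".
while read -r start length
do
  printf "%s\n" "${string:$start:$length}"
done < /path/to/file/with/startpoints_and_lengths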
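
The same loop can be wrapped in a function so the offset convention is explicit. What follows is only a sketch: the function name `extract_substrings` and its optional third argument are made up for illustration and are not part of the original post. It also assumes the positions file may start with a header line such as "Start Length", which it skips.

```shell
#!/usr/bin/env bash
# Hypothetical helper (name and arguments are assumptions):
#   extract_substrings STRING_FILE POSITIONS_FILE [BASE]
# BASE is 0 for bash-style offsets (default) or 1 for awk-style positions.
extract_substrings() {
  local string_file=$1 positions_file=$2 base=${3:-0}
  local string start length

  # Read the whole string into a variable; trailing newlines are stripped.
  string=$(< "$string_file")

  while read -r start length
  do
    # Skip blank lines and any non-numeric header such as "Start Length".
    [[ $start =~ ^[0-9]+$ && $length =~ ^[0-9]+$ ]] || continue

    # The offset is an arithmetic expression, so the base can be subtracted inline.
    printf '%s\n' "${string:start-base:length}"
  done < "$positions_file"
}
```

With awk-style (1-based) start points, it might be called as `extract_substrings infile /path/to/file/with/startpoints_and_lengths 1 > outfile`. Leaving off the last argument keeps the 0-based behaviour of the original loop.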

Unix & Linux Forums Content Copyright 1993-2022. All Rights Reserved.