Script extract text from txt file with grep


 
# 1  
Old 09-02-2014

All,

I need a script that grabs some text from the GitHub API and uses grep (or another tool) to find a string of characters that starts with a double quote ("), followed by two letters, may contain a pipe (|), and ends with ). What I have so far is below, but it's not returning anything.


Code:
#!/bin/bash
IFS_bak=$IFS
IFS=$'\r\n'
uid='xxxxxxxxxxxxx'
GH_OAUTH='xxxxxxxxxxxxxxxxxxxxxxxxxxx'
curl https://raw.githubusercontent.com/SomeOrganization/Infrastructure/SomeRepository/SomeContainer/SSHKeyserverList.sh?access_token=$GH_OAUTH > somefile.txt
cat somefile.txt | grep '^\"[a-z][a-z]\|\\)' > outputfile.txt

Example text reply from the GitHub API:
Code:
"SomeUser1" ) # Some Name (Some@emailDomain.com)
            NewSSHKey="ssh-rsa xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx some@email.com"
            ;;
       "SomeUser2" ) # Some Name (Some@emailDomain.com)
            NewSSHKey="ssh-rsa xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx some@email.com"
            ;;
         "SomeUser3" | "alternateUserName3 ) # Some Name (Some@emailDomain.com)
            NewSSHKey="ssh-rsa xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx some@email.com"
                ;;

In summary, what I need is this line:

"UserName" ) # Users Name (Users@e-mailDomain.com)
or, in the case of multiple user names:
"UserName" | SecondaryUserName ) # Users Name (Users@e-mailDomain.com)

In my code above I'm trying to grep for the string that starts with " and ends with ) and may or may not contain a pipe. I would like the script to print all of the lines containing that string to a text file. Is what I have on the right path, or am I way off base here?

Thanks for reading,

Choco
# 2  
Old 09-02-2014
There are two main issues with your grep. First, you are looking for a match at the beginning of the line (using ^), and only one line in your example text has data at the very beginning: the first line. Second, you are only matching two lowercase letters, whereas the names in your sample start with an uppercase letter.

If your requirement is essentially to look for a " followed by at least two letters, then any other data (which may include a pipe), followed by a ), you can simply change your grep to:

Code:
grep '\"[a-zA-Z][a-zA-Z].*)'

Also, you don't need the cat in there. You can pass the file directly to grep, so the last line in your script becomes:

Code:
grep '\"[a-zA-Z][a-zA-Z].*)' somefile.txt > outputfile.txt
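As a rough check of the two points above, the following sketch compares the anchored, lowercase-only start of the original pattern with the revised one. The sample lines are shortened stand-ins modelled on the reply in post #1, and the temporary file and variable names are made up for illustration.

```shell
#!/bin/sh
# Hypothetical sample data modelled on the reply shown in post #1
# (the SSH key is shortened; the real file will differ).
sample=$(mktemp)
cat > "$sample" <<'EOF'
"SomeUser1" ) # Some Name (Some@emailDomain.com)
            NewSSHKey="ssh-rsa xxxx some@email.com"
            ;;
       "SomeUser2" ) # Some Name (Some@emailDomain.com)
            NewSSHKey="ssh-rsa xxxx some@email.com"
            ;;
EOF

# Start of the original pattern: anchored at column 1 and lowercase only.
# "|| true" keeps the script going when grep finds nothing (exit status 1).
original_hits=$(grep -c '^"[a-z][a-z]' "$sample" || true)

# Revised pattern: a quote, two letters of either case, anything, then ")".
revised_hits=$(grep -c '"[a-zA-Z][a-zA-Z].*)' "$sample" || true)

echo "original: $original_hits, revised: $revised_hits"
rm -f "$sample"
```

The key lines are skipped by the revised pattern only because they contain no ) after the quote. If a real key comment contained one, that line would match as well.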

This User Gave Thanks to pilnet101 For This Post:
# 3  
Old 09-02-2014
For the sample data shown, you could simply use:
Code:
grep '^ *"' somefile.txt > outputfile.txt

Note that the second name in:
Code:
         "SomeUser3" | "alternateUserName3 ) # Some Name (Some@emailDomain.com)

is missing a closing double quote character.
This User Gave Thanks to Don Cragun For This Post:
# 4  
Old 09-02-2014
Don,

While working through Pil's solution I thought the same thing and tried it. Strangely, when I used this shortened version, grep skipped some lines and pulled some bad lines out of the header. I ended up with this:

Code:
grep '\".*)*#' somefile.txt

Thank you for the reply.

Choco

---------- Post updated at 06:21 PM ---------- Previous update was at 06:20 PM ----------

Thank you, sir! This worked straight away but included some information from the file header. I used it as a template to do this:

Code:
grep '\".*)*#' somefile.txt

and it works! Thanks again!
# 5  
Old 09-02-2014
Close. All this RE matches is a " followed by zero or more other characters, followed by zero or more ), followed by a #. I assume you meant to require a ) followed by zero or more spaces. That would be:
Code:
grep '".*) *#' somefile.txt
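The difference between the two expressions can be seen with a small sketch. The header-style line below is invented for illustration (the real file header was not posted), as are the file and variable names.

```shell
#!/bin/sh
# Hypothetical lines: a header-style comment with no ")" and one wanted entry.
lines=$(mktemp)
cat > "$lines" <<'EOF'
# Usage: pass "name" to select a key # header comment
"SomeUser1" ) # Some Name (Some@emailDomain.com)
EOF

# ")*" allows zero parentheses, so any line with a quote and a later "#" matches.
loose=$(grep -c '".*)*#' "$lines" || true)

# ") *" requires one ")" followed by optional spaces before the "#".
strict=$(grep -c '".*) *#' "$lines" || true)

echo "loose: $loose, strict: $strict"
rm -f "$lines"
```

With these two lines, the looser expression counts both, while the stricter one counts only the user entry.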

If my earlier suggestion didn't match all of the lines you wanted, is it possible that you have some leading <tab> characters instead of spaces on some of the lines you wanted to match? The following would take care of that:
Code:
grep '^[[:space:]]*"' somefile.txt

That would not, however, keep it from matching header lines that you didn't want.
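A minimal sketch of the tab case, assuming one entry is indented with spaces and another with a tab (the data, file name, and variable names are hypothetical):

```shell
#!/bin/sh
# Hypothetical data: first entry indented with spaces, second with a tab.
tabbed=$(mktemp)
printf '   "SomeUser1" ) # Some Name\n\t"SomeUser2" ) # Some Name\n' > "$tabbed"

# "^ *" accepts only space characters before the quote.
spaces_only=$(grep -c '^ *"' "$tabbed" || true)

# "[[:space:]]" also accepts tabs (and other whitespace).
any_blank=$(grep -c '^[[:space:]]*"' "$tabbed" || true)

echo "spaces only: $spaces_only, any whitespace: $any_blank"
rm -f "$tabbed"
```

To check whether the real file actually contains tabs, something like `od -c somefile.txt | head` should show them as \t.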