Post 302976350 by aberg on Tuesday 28th of June 2016 12:53:38 PM
How to use grep in a loop using a bash script?

Dear all,

Please help with the following.

I have a file, let's call it data.txt, that has 3 columns and approx 700,000 lines, and looks like this:

Code:
rs1234  A  C
rs1236  T  G
rs2345  G  T

Moderator's Comments:
Please use code tags as required by forum rules!


I have a second file, called reference.txt, which has one column and about 500,000 lines. It contains some, but not all, of the values found in column 1 of data.txt, e.g.

Code:
rs1234
rs2345
...

I want to 'grep' out all the lines in data.txt that have a match in reference.txt, so that I end with:

Code:
rs1234  A  C
rs2345  G  T

I have tried:
Code:
cat data.txt | grep -f reference.txt > output.txt

But this was taking far too long.
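A possible reason for the slowness: without further options, grep -f treats every line of reference.txt as a regular expression. The following is only a sketch, assuming the IDs are plain strings with no regex meaning and that matching them as whole words is acceptable. The sample files are the three-line excerpts from this post, not the real data. -F and -f are standard grep options; -w is available in GNU and BSD grep.

```shell
# Sample inputs copied from the post (the real files are far larger).
printf 'rs1234  A  C\nrs1236  T  G\nrs2345  G  T\n' > data.txt
printf 'rs1234\nrs2345\n' > reference.txt

# -F  treat each line of reference.txt as a fixed string, not a regex
# -w  match whole words only, so a shorter ID cannot match inside a longer one
# -f  read the list of patterns from a file
grep -F -w -f reference.txt data.txt > output.txt

cat output.txt
```

Note that this matches an ID anywhere on the line, not only in column 1. That makes no difference for the sample above, but it is an assumption about the real data.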

I therefore thought I might need to loop it using a bash script. I had a go, but got nowhere with the following:

Code:
for i in reference.txt; do
grep "$i" data.txt
done
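The loop above runs exactly once, with $i set to the literal word reference.txt, because for iterates over the words written after in and never opens the file. grep then searches data.txt for the string "reference.txt" and finds nothing. Below is a sketch of two alternatives, again using the small excerpts from this post as stand-in data. The first reads reference.txt line by line, which works but rescans data.txt once per ID and would probably be very slow on the full files. The second is a single-pass awk lookup on column 1; the array name ids and the file loop_output.txt are only illustrative.

```shell
# Sample inputs copied from the post (the real files are far larger).
printf 'rs1234  A  C\nrs1236  T  G\nrs2345  G  T\n' > data.txt
printf 'rs1234\nrs2345\n' > reference.txt

# Alternative 1: a corrected loop. It reads one ID per line from
# reference.txt, but runs one grep over data.txt for every ID.
while IFS= read -r id; do
    grep -F -w -- "$id" data.txt
done < reference.txt > loop_output.txt

# Alternative 2: one pass over each file.
# While reading the first file (NR==FNR), store each ID as an array key.
# While reading the second file, print lines whose first column is a stored key.
awk 'NR==FNR { ids[$1]; next } $1 in ids' reference.txt data.txt > output.txt

cat output.txt
```

The awk version compares column 1 exactly, so an ID that happens to appear in another column does not cause a match, and the output keeps the line order of data.txt.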

I am sure that this must be quite simple to do, but would be grateful for your help with this.

Thank you,

AB

Last edited by RudiC; 06-29-2016 at 06:26 AM. Reason: Added code tags.
 
