09-09-2008
Quote: Originally Posted by gbalsu
Dear all
I have a large file with ~10 million lines.
The values in the first two columns have matching partners.
For example:
A A
A B
B B
or
A A
B A
B B
The matches may be separated by an unknown number of lines.
My intention is to group them and add a "group" value in the last column.
For example
A A A
A B A
B B A
or
A A A
B A A
B B A
How do you determine the group value? Why is the third line not B B B?
Quote:
Rest assured that only one of A B and B A will be present and not both.
Any help will be highly appreciated.
A may have matches in addition to B, and any number of them. In all cases, though, I would like to name the group after the first partner of the first instance, i.e. A in this case.
It would be helpful if you provided more examples from the file.
It might also help if you posted some real data in addition to the abbreviated, single-letter data.
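Until more data is posted, here is one possible reading of the request as a rough sketch, not a confirmed solution. It treats every value in columns 1 and 2 as a node, links the two values found on each line, and labels each line with the column-1 value of the first line on which its group appeared. The function name group_pairs is made up for illustration. The sketch assumes whitespace-separated columns, at least two columns on every line, a regular input file (it is read twice), and that chains such as "A B" followed by "B C" belong to one group, which the posts above do not confirm.

```shell
#!/bin/sh
# group_pairs FILE
# Hypothetical helper: prints every line of FILE with a group label appended.
# Assumes whitespace-separated columns and at least two columns per line.
group_pairs() {
    awk '
    # find(x): return the root (label) of the group that contains x.
    # r and n are local variables.
    function find(x,    r, n) {
        r = x
        while (parent[r] != r) r = parent[r]
        # Path compression: point every visited node straight at the root.
        while (parent[x] != r) { n = parent[x]; parent[x] = r; x = n }
        return r
    }
    # First pass: link column 1 and column 2 of every line.
    NR == FNR {
        if (!($1 in parent)) { parent[$1] = $1; seen[$1] = ++order }
        if (!($2 in parent)) { parent[$2] = $2; seen[$2] = ++order }
        a = find($1); b = find($2)
        if (a != b) {
            # Keep the root that was seen first, so the label is the
            # first partner of the first instance of the group.
            if (seen[a] < seen[b]) parent[b] = a; else parent[a] = b
        }
        next
    }
    # Second pass: print the original line followed by its group label.
    { print $0, find($1) }
    ' "$1" "$1"
}
```

It could be called as: group_pairs pairs.txt > pairs.grouped.txt (both file names are placeholders). Two passes are needed because a later line may join two groups that looked separate earlier. The arrays hold one entry per distinct value, so with ~10 million lines memory use may be significant.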
LEARN ABOUT DEBIAN
shell-quote
SHELL-QUOTE(1p) User Contributed Perl Documentation SHELL-QUOTE(1p)
NAME
shell-quote - quote arguments for safe use, unmodified in a shell command
SYNOPSIS
shell-quote [switch]... arg...
DESCRIPTION
shell-quote lets you pass arbitrary strings through the shell so that they won't be changed by the shell. This lets you process commands
or files with embedded white space or shell globbing characters safely. Here are a few examples.
EXAMPLES
ssh preserving args
When running a remote command with ssh, ssh doesn't preserve the separate arguments it receives. It just joins them with spaces and
passes them to "$SHELL -c". This doesn't work as intended:
ssh host touch 'hi there' # fails
It creates 2 files, hi and there. Instead, do this:
cmd=`shell-quote touch 'hi there'`
ssh host "$cmd"
This gives you just 1 file, hi there.
process find output
It's not ordinarily possible to process an arbitrary list of files output by find with a shell script. Anything you put in $IFS to
split up the output could legitimately be in a file's name. Here's how you can do it using shell-quote:
eval set -- `find -type f -print0 | xargs -0 shell-quote --`
debug shell scripts
shell-quote is better than echo for debugging shell scripts.
debug() {
    [ -z "$debug" ] || shell-quote "debug:" "$@"
}
With echo you can't tell the difference between "debug 'foo bar'" and "debug foo bar", but with shell-quote you can.
save a command for later
shell-quote can be used to build up a shell command to run later. Say you want the user to be able to give you switches for a command
you're going to run. If you don't want the switches to be re-evaluated by the shell (which is usually a good idea, else there are
things the user can't pass through), you can do something like this:
user_switches=
while [ $# != 0 ]
do
    case x$1 in
        x--pass-through)
            [ $# -gt 1 ] || die "need an argument for $1"
            user_switches="$user_switches "`shell-quote -- "$2"`
            shift;;
        # process other switches
    esac
    shift
done
# later
eval "shell-quote some-command $user_switches my args"
OPTIONS
--debug
Turn debugging on.
--help
Show the usage message and die.
--version
Show the version number and exit.
AVAILABILITY
The code is licensed under the GNU GPL. Check http://www.argon.org/~roderick/ or CPAN for updated versions.
AUTHOR
Roderick Schertler <roderick@argon.org>
perl v5.8.4 2005-05-03 SHELL-QUOTE(1p)