Merging strings that have identical rownames in a dataframe
Hi
I have a data frame with repeated names in column 1 and different descriptors in column 2. I want to concatenate the descriptors of all rows that share the same entry in column 1 into a single row, joined by any separator.
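One possible way to picture the merge, sketched in plain Python rather than R. The function name `merge_by_key`, the `;` separator, and the sample rows are all assumptions made for illustration; the post itself gives no data.

```python
from collections import defaultdict

def merge_by_key(rows, sep=";"):
    """Join the descriptors of all rows that share the same name (column 1)."""
    grouped = defaultdict(list)  # dicts keep insertion order in Python 3.7+
    for name, descriptor in rows:
        grouped[name].append(descriptor)
    # One output row per distinct name, descriptors joined by the separator.
    return [(name, sep.join(descs)) for name, descs in grouped.items()]

# Hypothetical sample data, not taken from the original post.
rows = [("gene1", "kinase"), ("gene2", "transporter"), ("gene1", "membrane")]
merged = merge_by_key(rows)
for name, descriptors in merged:
    print(name, descriptors, sep="\t")
```

Names come out in the order they first appear, and any separator can be passed through `sep`.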
I am looking to replace two or more strings on different lines using sed, but not with the same variable, i.e.:
# cat xxx.file
<abc>
abc def ghi
abc def ghi
abc def ghi
Currently I can only change each line with the same pattern:
# sed -e '/<abc>/!s/abc\(.*\)/jkl mno/' xxx.file
abc jkl mno... (3 Replies)
I have a sorted file like:
Apple 3
Apple 5
Apple 8
Banana 2
Banana 3
Grape 31
Orange 7
Orange 13
I'd like to scan $1 and, whenever $1 changes from one row to the next, print the last row of the finished group followed by the number of times its $1 was found.
So the output would look like:
Apple 8 3
Banana... (2 Replies)
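A minimal sketch of that grouping logic, written in Python instead of awk. It assumes the input is already sorted on the first field, as the post says; the function name `last_row_with_count` is made up for this example.

```python
def last_row_with_count(lines):
    """For input sorted on the key, return (key, last value, count) per run of equal keys."""
    results = []
    prev_key = prev_value = None
    count = 0
    for line in lines:
        key, value = line.split()
        # A new key closes the previous group: emit its last row and its size.
        if key != prev_key and prev_key is not None:
            results.append((prev_key, prev_value, count))
            count = 0
        prev_key, prev_value = key, value
        count += 1
    # Emit the final group, which no later key change would trigger.
    if prev_key is not None:
        results.append((prev_key, prev_value, count))
    return results

data = """Apple 3
Apple 5
Apple 8
Banana 2
Banana 3
Grape 31
Orange 7
Orange 13""".splitlines()

summary = last_row_with_count(data)
for key, value, count in summary:
    print(key, value, count)
```

The first printed line is `Apple 8 3`, matching the sample output in the post. The values stay as strings because nothing here needs arithmetic on them.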
Hi. I'm hoping that someone can help me with a bash script to delete a block of lines from a file.
What I want to do is delete every line between two strings that are the same,
including the line the first string is on but not the second.
(Marked lines to match with !)
For example if I... (2 Replies)
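Since the example in the post is cut off, the following is only a rough sketch of one reading of the request, in Python rather than bash. It assumes a single known marker string, looks at its first two occurrences only, and drops everything from the first matching line up to (but not including) the second. The name `delete_until_repeat` and the sample lines are hypothetical.

```python
def delete_until_repeat(lines, marker):
    """Drop lines from the first line containing marker up to, not including, the second."""
    hits = [i for i, line in enumerate(lines) if marker in line]
    if len(hits) < 2:
        # Fewer than two matches: nothing to delete, return a copy unchanged.
        return list(lines)
    first, second = hits[0], hits[1]
    return lines[:first] + lines[second:]

# Hypothetical input; the "!" marker follows the post's note about marked lines.
sample = ["keep 1", "!start", "drop me", "!start", "keep 2"]
cleaned = delete_until_repeat(sample, "!start")
print("\n".join(cleaned))
```

If the marker can repeat more than twice, the function would need to be applied again or extended; the post does not say which behaviour is wanted.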
I have a problem finding blocks of identical strings. I solved the problem of finding consecutive identical words, and now I want to expand the code to find and remove consecutive identical blocks of strings.
For example, the awk code for removing consecutive identical words is:... (2 Replies)
There do not seem to be many posts about the R language, so here is one: how can I grep a sublist of a list, like grep -f in Unix? Say I have a dataframe
ID v1 v2 v3
A 1 3 4
B 4 5 6
C 7 8 9
D 1 3 4
E 1 3 3
F 2 4 5
and I only need
ID v1 v2 v3
A 1 3 4
C 7 8 9
E 1 3 3
F 2 4 5
by something like
grep... (2 Replies)
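The row selection described above can be sketched as a set-membership filter on the ID column. This is a Python illustration, not R, and the helper name `filter_rows` is an assumption; the table and the wanted IDs are the ones from the post.

```python
def filter_rows(table, wanted_ids):
    """Keep the header plus every row whose first field is in wanted_ids."""
    header, *rows = table
    wanted = set(wanted_ids)  # set lookup, similar in spirit to grep -f with fixed patterns
    return [header] + [row for row in rows if row[0] in wanted]

table = [line.split() for line in """ID v1 v2 v3
A 1 3 4
B 4 5 6
C 7 8 9
D 1 3 4
E 1 3 3
F 2 4 5""".splitlines()]

kept = filter_rows(table, ["A", "C", "E", "F"])
for row in kept:
    print(" ".join(row))
```

Unlike a plain `grep -f`, this matches the ID field exactly, so an ID such as `A` would not also pull in a row named `AB`.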
Dear all,
I need a little help. I am working on a frequency driven database in which the structure is as under:
headword=gloss<space>Frequency
The data which I am working with has dupes i.e. the Headword is repeated more than once with a different gloss variant on the right hand side and... (8 Replies)
Hey,
I'm having a hard time trying to print only the first occurrence between two identical strings.
for the following output:
please
help
me im a
noob
please
im a noob
help me
noob
please
help
me im a
noob
please
im a noob
help me
noob (3 Replies)
Hello all,
I need to filter a dataframe composed of several columns of data to remove the duplicates according to one of the columns. I did it with pandas. At the same time, I need the last column, which contains all different (non-redundant) data, to be preserved in the output like this:
A ... (5 Replies)
Discussion started by: pedro88
5 Replies
psc
PSC(1) General Commands Manual PSC(1)
NAME
psc - prepare sc files
SYNOPSIS
psc [-fLkrSPv] [-s cell] [-R n] [-C n] [-n n] [-d c]
DESCRIPTION
Psc is used to prepare data for input to the spreadsheet calculator sc(1). It accepts normal ascii data on standard input. Standard output
is a sc file. With no options, psc starts the spreadsheet in cell A0. Strings are right justified. All data on a line is entered on
the same row; new input lines cause the output row number to increment by one. The default delimiters are tab and space. The column formats
are set to one larger than the number of columns required to hold the largest value in the column.
OPTIONS
-f Omit column width calculations. This option is for preparing data to be merged with an existing spreadsheet. If the option is not
specified, the column widths calculated for the data read by psc will override those already set in the existing spreadsheet.
-L Left justify strings.
-k Keep all delimiters. This option causes the output cell to change on each new delimiter encountered in the input stream. The
default action is to condense multiple delimiters to one, so that the cell only changes once per input data item.
-r Output the data by row first then column. For input consisting of a single column, this option will result in output of one row
with multiple columns instead of a single column spreadsheet.
-s cell
Start the top left corner of the spreadsheet in cell. For example, -s B33 will arrange the output data so that the spreadsheet
starts in column B, row 33.
-R n Increment by n on each new output row.
-C n Increment by n on each new output column.
-n n Output n rows before advancing to the next column. This option is used when the input is arranged in a single column and the
spreadsheet is to have multiple columns, each of which is to be length n.
-d c Use the single character c as the delimiter between input fields.
-P Plain numbers only. A field is a number only when there is no imbedded [-+eE].
-S All numbers are strings.
-v Print the version of psc
SEE ALSO
sc(1)
AUTHOR
Robert Bond
PSC 7.16 19 September 2002 PSC(1)