Shell Programming and Scripting: Post 302584039 by cpp_beginner on Thursday 22nd of December 2011 03:15:20 AM
Help with replace duplicate content

Input file:
Code:
CCNI	data564_input1	264
CORO1A	data564_input2	155
ABC-B	data17_input1	3466
ABC-B	data17_input2	1133
ABC-B	data17_input3	2162
ABC-B	data17_input4	2019
HNRNPA2B1	data95_input1	101
HNRNPA2B1	data95_input2	340
IFITM1	data105_input2	291
IFITM2	data105_input1	505
MYL12A	data352_input2	212
MYL12B	data352_input1	131
MYL12B	data352_input3	76

Desired output file:
Code:
CCNI	data564_input1	264
CORO1A	data564_input2	155
ABC-B	data17_input1	3466
	data17_input2	1133
	data17_input3	2162
	data17_input4	2019
HNRNPA2B1	data95_input1	101
	data95_input2	340
IFITM1	data105_input2	291
IFITM2	data105_input1	505
MYL12A	data352_input2	212
MYL12B	data352_input1	131
	data352_input3	76

The columns are separated by a single tab ("\t").
I would like to replace each repeated value in column 1 with an empty field, keeping only its first occurrence.
Thanks for any advice.
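
One possible approach, sketched with awk: remember the column 1 value of the previous line and blank the field whenever the current line repeats it. This assumes the repeated values sit on consecutive lines, as they do in the sample input, and that the file is strictly tab-separated. The function name blank_repeats and the file names in the usage line are made up for illustration.

```shell
# blank_repeats: hypothetical helper name, not from the original post.
# Reads tab-separated lines from the given files (or stdin) and empties
# column 1 when it equals column 1 of the line directly before it.
blank_repeats() {
    awk 'BEGIN { FS = OFS = "\t" }   # split and rejoin fields on tabs
         {
             key = $1                          # save the value before changing it
             if (NR > 1 && key == prev) $1 = ""  # repeat of previous line: blank it
             prev = key
             print
         }' "$@"
}

# Usage (file names are placeholders):
#   blank_repeats input.txt > output.txt
```

If the repeated values might not be adjacent, an array keyed on column 1 (for example `seen[$1]++` as the test) could replace the comparison with the previous line, at the cost of holding every distinct key in memory.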
 
