How to sum value of a column by range defined in another file awk?
Post 303031536 by yifangt on Friday 1st of March 2019 02:57:57 PM
Quote:
What overlapping problem? Using column 3, those do not overlap. Are we intended to count column 2 as well?

By "overlapping" I meant overlap with the ranges. For example, the line N1 48 181 2 spans two ranges:
Code:
N1   0    99   ?
N1 100   199   ?

So I simply ignore column 3 (181) and assign the line to the range N1 0 99, based on column 2 (48).
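
The rule described above (use only column 2 to pick the range) can be sketched in awk. This is a minimal illustration that assumes fixed-width ranges of 100 starting at 0, as in the two example ranges; `assign_range` and the `width` variable are hypothetical names, not something from the thread.

```shell
# Hypothetical helper: reads "chr start end value" lines on stdin and prints
# the range that holds column 2. Assumes fixed-width ranges starting at 0.
assign_range() {
    awk -v width=100 '{
        lo = int($2 / width) * width      # lower bound of the range holding column 2
        print $1, lo, lo + width - 1      # column 3 is deliberately ignored
    }'
}

printf 'N1 48 181 2\n' | assign_range     # prints: N1 0 99
```

If the ranges in the second file are not all the same width, this arithmetic shortcut does not apply and the ranges have to be looked up instead.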
Quote:
Are all of them N1?

No. N1 is the name of a chromosome, and there are 50 different names (N1, N19, Scaff01, ...). Sorry, I should have provided a better sample with at least two chromosomes.
Code:
file1.table
N1    0    48    0
N1    48    181    2
N1    181    193    0
N1    193    326    2
N1    326    457    0
N1    457    471    1
N1    471    590    2
N1    590    604    1
N1    604    752    0
N1    752    875    1
N2    0    580    0
N2    580    592    1
N2    592    713    2
N2    568    627    1
N2    627    698    2
N2    698    701    3
N2    701    717    2
N2    713    724    1
N2    717    726    3

Code:
file2.range
chr start end 
N1      0     99   
N1    100    199 
N1    200    299
N1    300    399 
N1    400    499 
N1    500    599 
N1    600    699 
N1    700    799 
N1    800    899 
N1    900    999
N2      0     99 
N2    100    199 
N2    200    299 
N2    300    399 
N2    400    499 
N2    500    599 
N2    600    699 
N2    700    799 
N2    800    899 
N2    900    999

And the expected output:
Code:
chr start end  count 
N1      0     99    3 
N1    100    199    2 
N1    200    299    0 
N1    300    399    0 
N1    400    499    3 
N1    500    599    1 
N1    600    699    1 
N1    700    799    0 
N1    800    899    0 
N1    900    999    0
N2      0     99    0 
N2    100    199    0 
N2    200    299    0 
N2    300    399    0 
N2    400    499    0 
N2    500    599    3 
N2    600    699    5 
N2    700    799    6 
N2    800    899    0 
N2    900    999    0
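
One possible way to produce a table of this shape is a two-file awk pass: read the ranges first, then add column 4 of each record to the range that contains its column 2. This is only a sketch under stated assumptions: column 3 is ignored as described in the post, the range file may start with a non-numeric header line, and `sum_by_range` is a hypothetical helper name. The data below is a trimmed copy of the sample.

```shell
#!/bin/sh
# sum_by_range is a hypothetical helper name, not something from the thread.
# Usage: sum_by_range RANGE_FILE TABLE_FILE
sum_by_range() {
    awk '
        # First file: store each range in input order. Lines whose second
        # field is not a number (such as the "chr start end" header) are skipped.
        NR == FNR {
            if ($2 ~ /^[0-9]+$/) {
                n++
                chr[n] = $1; lo[n] = $2 + 0; hi[n] = $3 + 0; sum[n] = 0
            }
            next
        }
        # Second file: add column 4 to the first range that contains column 2.
        # Column 3 is ignored (assumption taken from the post).
        {
            for (i = 1; i <= n; i++)
                if ($1 == chr[i] && $2 + 0 >= lo[i] && $2 + 0 <= hi[i]) {
                    sum[i] += $4
                    break
                }
        }
        END {
            print "chr", "start", "end", "count"
            for (i = 1; i <= n; i++)
                print chr[i], lo[i], hi[i], sum[i]
        }
    ' "$1" "$2"
}

# Trimmed copy of the sample data from the post.
tmp=$(mktemp -d)
cat > "$tmp/file2.range" <<'EOF'
chr start end
N1 0 99
N1 100 199
N1 200 299
N2 500 599
EOF
cat > "$tmp/file1.table" <<'EOF'
N1 0 48 0
N1 48 181 2
N1 181 193 0
N1 193 326 2
N2 580 592 1
N2 592 713 2
N2 568 627 1
EOF

sum_by_range "$tmp/file2.range" "$tmp/file1.table"
# chr start end count
# N1 0 99 2
# N1 100 199 2
# N1 200 299 0
# N2 500 599 4
rm -rf "$tmp"
```

With this rule the sums for N1 0 99 and N2 500 599 come out as 2 and 4, while the table in the post lists 3 for both rows, so the posted numbers may follow a slightly different rule or contain slips. The linear scan over all ranges for every record is simple but slow for large inputs; indexing the ranges by chromosome would be the obvious refinement.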


Last edited by yifangt; 03-01-2019 at 04:13 PM.. Reason: typo and markdown change
 
