Full Discussion: Data processing
Post 302693503 by bfantinatti on Wednesday 29th of August 2012 09:58:09 AM
Data processing

Hello guys!
I have a problem with processing some data.
I have some files with 3 columns. The 1st column is the name of my sample. The 2nd column is a numerical sequence (a very long one) starting from 1. The 3rd column is a feature of each line, represented by a number (completely independent of the 2nd column). Hypothetically, something like this:
Code:
scaffold_0 1 4
scaffold_0 2 4
scaffold_0 3 4
scaffold_0 4 6
scaffold_0 5 7
scaffold_0 6 7
scaffold_0 7 7
scaffold_0 8 7
scaffold_0 9 7

The problem is that when the value of the 3rd column is zero, the line is not included in the file I have, producing something like this:
Code:
scaffold_0 1 4
scaffold_0 2 4
scaffold_0 8 7
scaffold_0 9 7

Note that the 2nd column jumps from 2 to 8 (lines 3, 4, 5, 6 and 7 are missing because their 3rd-column values are zero).

Question: Is there a command line that adds the missing lines, resulting in something like this?
Code:
scaffold_0 1 4
scaffold_0 2 4
scaffold_0 3 0
scaffold_0 4 0
scaffold_0 5 0
scaffold_0 6 0
scaffold_0 7 0
scaffold_0 8 7
scaffold_0 9 7

Best regards..


Moderator's Comments:
Please use code tags next time for your code and data.

Last edited by zaxxon; 08-29-2012 at 11:14 AM.. Reason: code tags
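
One possible way to fill the gaps described in the post, sketched in Python rather than as a shell one-liner. This is only a sketch under several assumptions that the post does not confirm: the columns are whitespace-separated, positions within each sample are sorted in ascending order, and numbering starts at 1. The function name `fill_gaps` and the script name used below are hypothetical. Zero-valued lines that would come after the last line present in the file cannot be restored this way, because the total sequence length is not stored in the file.

```python
import sys
from typing import Iterable, Iterator


def fill_gaps(lines: Iterable[str], start: int = 1) -> Iterator[str]:
    """Yield each input line, plus '<name> <pos> 0' for every skipped position.

    Assumes three whitespace-separated columns and ascending positions
    within each sample name (assumptions, not stated in the post).
    """
    expected = {}  # next expected position, tracked per sample name
    for line in lines:
        fields = line.split()
        if len(fields) != 3:
            continue  # skip blank or malformed lines
        name, pos, value = fields[0], int(fields[1]), fields[2]
        # Emit a zero-valued line for every position missing before this one.
        for missing in range(expected.get(name, start), pos):
            yield f"{name} {missing} 0"
        yield f"{name} {pos} {value}"
        expected[name] = pos + 1


# Process a file only when a path is given on the command line.
if __name__ == "__main__" and len(sys.argv) > 1:
    with open(sys.argv[1]) as handle:
        for out in fill_gaps(handle):
            print(out)
```

Saved as, say, `fill_gaps.py`, it could be run as `python fill_gaps.py input.txt > filled.txt`. The function reads the input line by line and keeps only one counter per sample name in memory, so a very long sequence should not be a problem.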
 
