Sponsored Content
Top Forums Shell Programming and Scripting Output on one line using awk or sed Post 302950202 by cmccabe on Wednesday 22nd of July 2015 11:35:55 AM
Old 07-22-2015
Output on one line using awk or sed

I have a file of 100,000 lines in the below format:

answer.bed
Code:
chr1    957570    957852    
     NOC2L
chr1    976034    976270     
     PERM1
chr1    976542    976787    
     PERM1

I need to get each on one line and so far what I have tried doesn't seem to be working. Thank you Smilie.

Desired output:

Code:
chr1    957570    957852     NOC2L
chr1    976034    976270     PERM1
chr1    976542    976787     PERM1

I have tried:
Code:
 sed 'N;s/\n//' answer.bed > output.bed

output.txt
Code:
chr1    957570    957852     NOC2Lchr1    976034    976270     PERM1
chr1    976542    976787     PERM1chr1    978907    979122     PERM1
chr1    979192    979413     PERM1chr1    979478    979647     PERM1


Code:
 awk 'NR%2{printf $0" ";next;}1' answer.bed > output.bed

output.txt
Code:
chr1    957570    957852     NOC2L chr1    976034    976270     PERM1
chr1    976542    976787     PERM1 chr1    978907    979122     PERM1
chr1    979192    979413     PERM1 chr1    979478    979647     PERM1

 

10 More Discussions You Might Find Interesting

1. Shell Programming and Scripting

Read logline line by line with awk/sed

Hello, I have a logfile which is in this format: 1211667249500#3265 1211667266687#2875 1211667270781#1828 Is there a way to read the logfile line by line every time I execute the code and put the two numbers in the line in two separate variables? Something like: 1211667249500#3265... (7 Replies)
Discussion started by: dejavu88
7 Replies

2. UNIX for Dummies Questions & Answers

sed output without a new line

I am reading file and extracting the paragraph between START and END tags. contents of abc.txt Remember that $ means the last line in a file. You can also specify a range based on two regexps. Try START Note that this prints all blocks starting with lines containing regexp1 through lines... (1 Reply)
Discussion started by: ganesh_mak
1 Replies

3. Shell Programming and Scripting

single line input to multiple line output with sed

hey gents, I'm working on something that will use snmpwalk to query the devices on my network and retreive the device name, device IP, device model and device serial. I'm using Nmap for the enumeration and sed to clean up the results for use by snmpwalk. Once i get all the data organized I'm... (8 Replies)
Discussion started by: mitch
8 Replies

4. Shell Programming and Scripting

awk;sed appending line to previous line....

I know this has been asked before but I just can't parse the syntax as explained. I have a set of files that has user information spread out over two lines that I wish to merge into one: User1NameLast User1NameFirst User1Address E-Mail:User1email User2NameLast User2NameFirst User2Address... (11 Replies)
Discussion started by: walkerwheeler
11 Replies

5. Shell Programming and Scripting

sed and Output line too long

hello i try this command in console mode sed -e :a -e '/$/N; s/\(\)\n/\1 /; ta' test.txt > result.txt i have in the output screen "Output line too long" for multiples lines can you please tell me how can i retrieve those long lines during the execution ? Another thing very... (5 Replies)
Discussion started by: ade05fr
5 Replies

6. Shell Programming and Scripting

Insert output from a file to beginning of line with sed

Hi I've been trying to search but couldn't quite get the answer I was looking for. I have a a file that's like this Time, 9/1/12 0:00, 1033 0:10, 1044 ... 23:50, 1050 How do I make it so the file will be like this? 9/1/12, 0:00, 1033 9/1/12, 0:10, 1044 ... 9/1/12, 23:50, 1050 I... (4 Replies)
Discussion started by: diesel88
4 Replies

7. Shell Programming and Scripting

sed and awk giving error ./sample.sh: line 13: sed: command not found

Hi, I am running a script sample.sh in bash environment .In the script i am using sed and awk commands which when executed individually from terminal they are getting executed normally but when i give these sed and awk commands in the script it is giving the below errors :- ./sample.sh: line... (12 Replies)
Discussion started by: satishmallidi
12 Replies

8. Shell Programming and Scripting

Problem with output awk and sed

I have file, i am extracting email address from file. but problem is that output is very ugly. I am using this command REMOVED "CSS OFFENDING CODE"... While original filename have no such character. Please suggest. (20 Replies)
Discussion started by: learnbash
20 Replies

9. Shell Programming and Scripting

Multiple line search, replace second line, using awk or sed

All, I appreciate any help you can offer here as this is well beyond my grasp of awk/sed... I have an input file similar to: &LOG &LOG Part: "@DB/TC10000021855/--F" &LOG &LOG &LOG Part: "@DB/TC10000021852/--F" &LOG Cloning_Action: RETAIN &LOG Part: "@DB/TCCP000010713/--A" &LOG &LOG... (5 Replies)
Discussion started by: KarmaPoliceT2
5 Replies

10. Shell Programming and Scripting

sed command to replace a line in a file using line number from the output of a pipe.

Sed command to replace a line in a file using line number from the output of a pipe. Is it possible to replace a whole line piped from someother command into a file at paritcular line... here is some basic execution flow.. the line number is 412 lineNo=412 Now i have a line... (1 Reply)
Discussion started by: vivek d r
1 Replies
tabix(1)						       Bioinformatics tools							  tabix(1)

NAME
bgzip - Block compression/decompression utility tabix - Generic indexer for TAB-delimited genome position files SYNOPSIS
bgzip [-cdhB] [-b virtualOffset] [-s size] [file] tabix [-0lf] [-p gff|bed|sam|vcf] [-s seqCol] [-b begCol] [-e endCol] [-S lineSkip] [-c metaChar] in.tab.bgz [region1 [region2 [...]]] DESCRIPTION
Tabix indexes a TAB-delimited genome position file in.tab.bgz and creates an index file in.tab.bgz.tbi when region is absent from the com- mand-line. The input data file must be position sorted and compressed by bgzip which has a gzip(1) like interface. After indexing, tabix is able to quickly retrieve data lines overlapping regions specified in the format "chr:beginPos-endPos". Fast data retrieval also works over network if URI is given as a file name and in this case the index file will be downloaded if it is not present locally. OPTIONS OF TABIX
-p STR Input format for indexing. Valid values are: gff, bed, sam, vcf and psltab. This option should not be applied together with any of -s, -b, -e, -c and -0; it is not used for data retrieval because this setting is stored in the index file. [gff] -s INT Column of sequence name. Option -s, -b, -e, -S, -c and -0 are all stored in the index file and thus not used in data retrieval. [1] -b INT Column of start chromosomal position. [4] -e INT Column of end chromosomal position. The end column can be the same as the start column. [5] -S INT Skip first INT lines in the data file. [0] -c CHAR Skip lines started with character CHAR. [#] -0 Specify that the position in the data file is 0-based (e.g. UCSC files) rather than 1-based. -h Print the header/meta lines. -B The second argument is a BED file. When this option is in use, the input file may not be sorted or indexed. The entire input will be read sequentially. Nonetheless, with this option, the format of the input must be specificed correctly on the command line. -f Force to overwrite the index file if it is present. -l List the sequence names stored in the index file. EXAMPLE
(grep ^"#" in.gff; grep -v ^"#" in.gff | sort -k1,1 -k4,4n) | bgzip > sorted.gff.gz; tabix -p gff sorted.gff.gz; tabix sorted.gff.gz chr1:10,000,000-20,000,000; NOTES
It is straightforward to achieve overlap queries using the standard B-tree index (with or without binning) implemented in all SQL data- bases, or the R-tree index in PostgreSQL and Oracle. But there are still many reasons to use tabix. Firstly, tabix directly works with a lot of widely used TAB-delimited formats such as GFF/GTF and BED. We do not need to design database schema or specialized binary formats. Data do not need to be duplicated in different formats, either. Secondly, tabix works on compressed data files while most SQL databases do not. The GenCode annotation GTF can be compressed down to 4%. Thirdly, tabix is fast. The same indexing algorithm is known to work effi- ciently for an alignment with a few billion short reads. SQL databases probably cannot easily handle data at this scale. Last but not the least, tabix supports remote data retrieval. One can put the data file and the index at an FTP or HTTP server, and other users or even web services will be able to get a slice without downloading the entire file. AUTHOR
Tabix was written by Heng Li. The BGZF library was originally implemented by Bob Handsaker and modified by Heng Li for remote file access and in-memory caching. SEE ALSO
samtools(1) tabix-0.2.0 11 May 2010 tabix(1)
All times are GMT -4. The time now is 10:34 PM.
Unix & Linux Forums Content Copyright 1993-2022. All Rights Reserved.
Privacy Policy