Split a line into multiple lines based on delimiters


 
# 1  
Old 08-19-2014

Hi,

I need help splitting any line that contains ; or , into separate lines, with the ID from the first column repeated on each new line.

input.txt
Code:
Ac020	 Not a good chemical process
AC030	 many has failed, 3 still maintained
AC040	 Putative; epithelial cells
AC050	 Predicted binding activity
AC060	 rodC Putative; upregulated in 48;h biofilm vs planktonic

The output should be:
Output.txt
Code:
Ac020	 Not a good chemical process
AC030	 many has failed 
AC030    3 still maintained
AC040	 Putative
AC040    epithelial cells
AC050	 Predicted binding activity
AC060	 rodC Putative
AC060    upregulated in 48 
AC060    h biofilm vs planktonic

I tried the code below, but it does not repeat the ID in the first column for the split lines:

Code:
sed -e 's/\(.\), /\1\n\t\t /g' input.txt | sed -e 's/\(.\);/\1\n\t\t/g' > Output.txt

The result that I got is:

Code:
Ac020	 Not a good chemical process
AC030	 many has failed 
         3 still maintained
AC040	 Putative
         epithelial cells
AC050	 Predicted binding activity
AC060	 rodC Putative
         upregulated in 48 
         h biofilm vs planktonic


I don't know how to make it show the ID. Can anyone advise or help me with this? Thanks.

Last edited by rbatte1; 08-20-2014 at 10:20 AM.. Reason: Added ICODE tags and capitalised first person singular
# 2  
Old 08-19-2014
Try

Code:
$ cat file
Ac020	 Not a good chemical process
AC030	 many has failed, 3 still maintained
AC040	 Putative; epithelial cells
AC050	 Predicted binding activity
AC060	 rodC Putative; upregulated in 48;h biofilm vs planktonic

Code:
$ awk 'gsub(/[;,]/,RS $1 OFS) + 1' OFS='\t' file

Result:
Code:
Ac020	 Not a good chemical process
AC030	 many has failed
AC030	 3 still maintained
AC040	 Putative
AC040	 epithelial cells
AC050	 Predicted binding activity
AC060	 rodC Putative
AC060	 upregulated in 48
AC060	h biofilm vs planktonic
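
For anyone unpacking the one-liner: gsub() returns the number of replacements it made, so adding 1 makes the pattern always true and every line gets printed. RS is the newline and $1 is the ID, so each delimiter is replaced by "newline, ID, tab". Below is a longer sketch of the same idea. The function name split_by_delim is made up for illustration, and unlike the one-liner it also swallows any blanks that follow the delimiter (my addition, not part of the original command). It assumes the ID contains no & or backslash, which gsub() would treat specially in the replacement.

```shell
#!/bin/sh
# Hypothetical helper: reads lines on stdin, writes the split lines to stdout.
# Usage (file name is only an example): split_by_delim < input.txt
split_by_delim() {
    awk '
    {
        id = $1                             # save the ID before $0 is modified
        gsub(/[;,][ \t]*/, "\n" id "\t")    # delimiter plus trailing blanks -> newline, ID, tab
        print                               # print the (possibly multi-line) record
    }'
}
```

Lines without a delimiter pass through unchanged, because gsub() simply finds nothing to replace.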

# 3  
Old 08-19-2014
Hi Akshay Hegde,

Thanks a bunch! It worked great.
# 4  
Old 08-19-2014
This will also work, though it is a bit lengthy. The second command is a shorter variant of the first:

Code:
$ awk 'match($0,regex){ n=split(substr($0,length($1)+1),A,regex); for(i=1;i<=n;i++)print $1,A[i]; next }1' regex='[;,]' file

Code:
awk 'n=split(substr($0,length($1)+1),A,regex){for(i=1;i<=n;i++)print $1,A[i]; next }1' regex='[;,]' file
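
The split() approach may be easier to follow when written out over several lines. This is only a sketch of the same technique, not a drop-in copy: the function name split_fields and the variable names are made up, and it trims the blanks around each piece and joins ID and text with a single tab, so the spacing differs slightly from the commands above.

```shell
#!/bin/sh
# Hypothetical helper: reads lines on stdin, writes the split lines to stdout.
# Usage (file name is only an example): split_fields < input.txt
split_fields() {
    awk -v regex='[;,]' '
    {
        id = $1
        rest = $0
        sub(/^[ \t]*[^ \t]+[ \t]*/, "", rest)    # drop the ID and the blanks after it
        n = split(rest, parts, regex)            # cut the remainder at every ; or ,
        if (n == 0) { print; next }              # nothing after the ID: print as is
        for (i = 1; i <= n; i++) {
            piece = parts[i]
            gsub(/^[ \t]+|[ \t]+$/, "", piece)   # trim leading and trailing blanks
            print id "\t" piece                  # repeat the ID in front of each piece
        }
    }'
}
```

A line without any delimiter yields n equal to 1, so it goes through the same loop and is printed once.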

# 5  
Old 08-19-2014

Yeah, I tried both and they work great too. They are a little harder to understand than the first one, but I am glad they give me something to think about. Thanks.

Last edited by rbatte1; 08-20-2014 at 11:08 AM.. Reason: Capitalised first person singular
# 6  
Old 08-19-2014
You can also do this with sed:

Code:
sed -E ':a ; s/^([^ \t]+[ \t]+)([^,;]+)[,;][ \t]*/\1\2\n\1/; ta' infile
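
The same command spread over several lines with comments, as a sketch for readers who want to see what each part does. It assumes GNU sed (for -E, and for \t and \n inside the expression); the function name split_with_sed is made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper: reads lines on stdin, writes the split lines to stdout.
# Usage (file name is only an example): split_with_sed < infile
split_with_sed() {
    sed -E '
        # Start of the loop.
        :a
        # \1 = the ID plus the blanks after it, \2 = text up to the first , or ;
        # The delimiter and any blanks after it become: newline + \1.
        s/^([^ \t]+[ \t]+)([^,;]+)[,;][ \t]*/\1\2\n\1/
        # If a substitution was made, go back to :a and look for the next delimiter.
        ta
    '
}
```

Because [^,;]+ also matches the newlines inserted on earlier passes, each pass of the loop finds the next remaining delimiter further down the pattern space.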


Last edited by Chubler_XL; 08-19-2014 at 06:04 PM.. Reason: Match space or tab
# 7  
Old 08-19-2014

Yeah, it worked great too. Thanks, Chubler_XL.