Post 302911584 by Mannu2525 on Friday 1st of August 2014 02:02:00 PM
Separated by columns, merge in a file, sort them on common column

Hi All,

I have 4 files in the format below. The contents shown are sample data.

File 1: Cut positions 1-4, then 6-7, then 8-14, then the rest of the line, and write them as columns in one new file. Insert the character H at the start of every line, e.g. HCTOT.
Code:
CTOT 456787897 Low fever
CTOR 556712345 High fever

File 2: Cut positions 1-4, then 6-7, then 8-14, then the rest of the line, and write them as columns in one new file. Insert D at the start of every line.
Code:
CTWE 456711111 Some Risk
CTWR 551124567 High Risk

File 3: Cut positions 1-4, then 6-7, then 8-14, then the rest of the line, and write them as columns in one new file. Insert L at the start of every line.
Code:
CTRE 459754125 Doctor Required
CTEE 559754214 Home Sickness

File 4: Cut positions 1-4, then 6-7, then 8-14, then the rest of the line, and write them as columns in one new file. Insert M at the start of every line.
Code:
CRRE 458757845 Hospital Required
CREE 558757812 Hospital Not required

Column 2 becomes the common column in all 4 files.

It is easy to cut the columns with cut -c and to insert the prefix with sed, but I do not get the desired output when I merge the pieces back together as columns. I used the paste command for that, but after pasting I see an extra blank between two columns, like CTOT<space><space>45.

I would like my output to be:
Code:
HCTOT 45 6787897 Low fever
DCTWE 45 6711111 Some Risk
LCTRE 45 9754125 Doctor Required
MCRRE 45 8757845 Hospital Required
HCTOR 55 6712345 High fever
DCTWR 55 1124567 High Risk
LCTEE 55 9754214 Home Sickness
MCREE 55 8757812 Hospital Not required

I sorted the result after that and combined all my logic into a shell script, but I still didn't get the desired output. Please help me out.
Thank you.
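
One possible approach, sketched here under stated assumptions rather than offered as the definitive answer: slice each line with awk's substr() instead of cut and paste, so the pieces are joined by exactly one space, then run a stable sort on column 2. The file names file1 to file4, the helper names split_fixed and merge_sorted, and the scratch directory are hypothetical. The sketch also assumes that position 15 is always the blank before the description, and that sort supports -s (a GNU/BSD extension, not POSIX).

```shell
#!/bin/sh
# Sketch only. The file names file1..file4, the helper names, and the
# scratch directory are assumptions made for illustration.

# split_fixed PREFIX FILE: print PREFIX + columns 1-4, columns 6-7,
# columns 8-14, and the text from column 16 on, joined by single spaces.
# Column 15 is assumed to be the blank that precedes the description.
split_fixed() {
    awk -v p="$1" '{
        print p substr($0, 1, 4), substr($0, 6, 2), substr($0, 8, 7), substr($0, 16)
    }' "$2"
}

# merge_sorted DIR: tag each file, concatenate in H, D, L, M order, then
# sort numerically on column 2. -s keeps the H, D, L, M order within
# each key (stable sort; GNU/BSD extension).
merge_sorted() {
    {
        split_fixed H "$1/file1"
        split_fixed D "$1/file2"
        split_fixed L "$1/file3"
        split_fixed M "$1/file4"
    } | sort -s -k2,2n
}

# Recreate the sample input from the post in a scratch directory.
dir=$(mktemp -d)
printf '%s\n' 'CTOT 456787897 Low fever' 'CTOR 556712345 High fever' > "$dir/file1"
printf '%s\n' 'CTWE 456711111 Some Risk' 'CTWR 551124567 High Risk' > "$dir/file2"
printf '%s\n' 'CTRE 459754125 Doctor Required' 'CTEE 559754214 Home Sickness' > "$dir/file3"
printf '%s\n' 'CRRE 458757845 Hospital Required' 'CREE 558757812 Hospital Not required' > "$dir/file4"

merge_sorted "$dir"
```

Because awk's print separates its comma-separated arguments with a single space, no extra blank should appear between CTOT and 45. Without -s, lines that share the same column 2 value may come out in a different order than H, D, L, M.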

Last edited by Mannu2525; 08-01-2014 at 03:03 PM.. Reason: formatting
 
