Separated by columns, merge in a file, sort them on common column
Hi All,
I have 4 files in the format below, which I am using as an example.
File 1: Cut positions 1-4, then 6-7, then 8-14, then the rest of the line, and write these as columns in one new file. Insert the character H at the start of every line, like HCTOT.
File 2: Cut positions 1-4, then 6-7, then 8-14, then the rest of the line, and write these as columns in one new file. Insert D at the start.
File 3: Cut positions 1-4, then 6-7, then 8-14, then the rest of the line, and write these as columns in one new file. Insert L at the start.
File 4: Cut positions 1-4, then 6-7, then 8-14, then the rest of the line, and write these as columns in one new file. Insert M at the start.
Column 2 becomes the common column in all 4 files.
It's easy to cut the columns using the cut -c command and to do the insertion using sed, but I do not get the desired output when I merge them as columns. I used the paste command for that, but after pasting I see one blank column between two columns, like CTOT<space><space>45.
I would like my output to be:
I used sorting after that and combined all my logic into a shell script but I didn't get the desired output. Please help me out.
Thank you.
Last edited by Mannu2525; 08-01-2014 at 03:03 PM..
Reason: formatting
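One way to avoid the cut/sed/paste chain is to do the slicing and the prefix in a single awk pass. The sketch below is only a guess at what is wanted, since the desired output is not shown in the post. The helper names split_fixed and merge_sorted are hypothetical, the single space between output columns is an assumption, and "sort on the common column" is read as sorting the combined lines on column 2. Note that paste joins its inputs with a TAB by default, which may or may not be the blank column seen above.

```shell
#!/bin/sh
# split_fixed PREFIX  (hypothetical helper, not from the original post)
# Reads fixed-width lines on stdin and prints: PREFIX+chars 1-4, chars 6-7,
# chars 8-14, and the rest of the line, separated by single spaces.
split_fixed() {
    awk -v p="$1" '{
        printf "%s%s %s %s %s\n", p,
            substr($0, 1, 4), substr($0, 6, 2), substr($0, 8, 7), substr($0, 15)
    }'
}

# merge_sorted FILE1 FILE2 FILE3 FILE4  (hypothetical helper)
# Prefixes the four files with H, D, L, M and sorts the combined lines on
# column 2, using column 1 as a tie-breaker. This layout is an assumption.
merge_sorted() {
    {
        split_fixed H < "$1"
        split_fixed D < "$2"
        split_fixed L < "$3"
        split_fixed M < "$4"
    } | sort -k2,2 -k1,1
}
```

Called as merge_sorted file1 file2 file3 file4, this writes the combined, sorted lines to standard output. If the goal is one row per column-2 value with the four files side by side, a join-style step would be needed instead of the plain sort.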
Hi guys,
Please guide me if you have a solution to this problem. I have tried paste -s but it's not giving the desired output.
I have a file with the following content-
A123 box1
B345 bat2
C431 my_id
A123 service
C431 box1
A123 my_id
I need two different outputs-
OUTPUT1
A123... (6 Replies)
Hi.
If we have this file
A B C
7 8 9
1 2 10
and this other file
A C D F
7 9 2 3
9 2 3 4
The result I'm looking for is the intersection with A B C D F
so the answer here will be (10 Replies)
I have 100 data files labelled 250.1.txt through 250.100.txt. The second columns of the data files partially match (there is about 90% overlap). Each data file has 4 columns.
I want to merge all these text files by the matching values in the second column. In the output, the first column should... (1 Reply)
Dear all
I have big file with two columns
A_AA960715 GO:0006952
A_AA960715 GO:0008152
A_AA960715 GO:0016491
A_AA960715 GO:0007165
A_AA960715 GO:0005618
A_AA960716 GO:0006952
A_AA960716 GO:0005618
A_AA960716... (15 Replies)
I have two files
1.txt
34, ABC, 7, 8, 0.9
35, CDE, 6.5, -2, 0.01
2.txt
34, ABC, 9, 6, -1.9
35, CDE, 8.5, -2.3, 5.01
So in both files the common columns are 1 and 2,
and the final output should look like
34, ABC, 7, 8, 0.9, 9, 6, -1.9
35, CDE, 6.5, -2, 0.01, 8.5, -2.3, 5.01
I tried using... (3 Replies)
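For the 1.txt / 2.txt case above, a two-file awk pass is one possible approach. This is a minimal sketch, assuming the separator is always a comma followed by one space and that each key (columns 1 and 2) appears at most once in 2.txt. The function name merge_on_first_two is hypothetical.

```shell
#!/bin/sh
# merge_on_first_two FILE_A FILE_B  (hypothetical helper)
# For every line of FILE_A whose first two fields also occur in FILE_B,
# print the FILE_A line followed by FILE_B's remaining fields.
merge_on_first_two() {
    awk -F', ' '
        NR == FNR {                      # first pass: FILE_B (listed first below)
            key = $1 FS $2
            rest = $0
            sub(/^[^,]*, [^,]*, /, "", rest)   # drop the two key fields
            extra[key] = rest
            next
        }
        {                                # second pass: FILE_A
            key = $1 FS $2
            if (key in extra) print $0 ", " extra[key]
        }
    ' "$2" "$1"
}
```

Run as merge_on_first_two 1.txt 2.txt. Lines of 1.txt with no matching key in 2.txt are dropped, which is a design choice of this sketch rather than something stated in the question.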
I have the following records from multiple files.
415 A G
415 A G
415 A T
415 A .
415 A .
421 G A
421 G A,C
421 G A
421 G A
421 G A,C
421 G .
427 A C
427 A ... (3 Replies)
Hi, I have two files and I want to join them on a common column. I tried to do this using the "join" command but that did not help.
File 1:
123 9a.vcf hy92.vcf hy90.vcf
Index Ref Alt Ref Alt Ref Alt
315 14 0 7 4 ... (6 Replies)
Hi, I am trying to selectively merge two files based on keys reported in the 1st column.
File1:
#file1-header1
file1-header2
111 qwe rtz uio
198 asd fgh jkl
165 yxc
789 poi uzt rew
89 lkj
File2:
#file2-header2
file2-header2
165 ghz nko2 ... (2 Replies)
Hi all,
I've multiple files. In this case 5. Space separated columns. Each file has 12 columns. Each file has 300-400K lines.
I want to get the output such that if a value in column 2 is present in all the files then get all the columns of that value and print it side by side.
Desired output... (15 Replies)
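The "present in all files" requirement above could be sketched in awk as follows. The desired output is cut off in the post, so the layout here is an assumption: one line per common column-2 value, with the matching line from each file appended in argument order, separated by a space. Only the first occurrence of a value within each file is used, all lines are held in memory, and empty input files are not handled. The function name common_col2 is hypothetical.

```shell
#!/bin/sh
# common_col2 FILE...  (hypothetical helper)
# Prints the lines whose column-2 value occurs in every FILE, side by side.
common_col2() {
    awk -v n="$#" '
        FNR == 1 { fileno++ }            # a new input file starts
        !seen[fileno, $2]++ {            # first time this value is seen in this file
            count[$2]++
            row[$2] = (fileno == 1 ? $0 : row[$2] " " $0)
        }
        END {
            for (k in count)
                if (count[k] == n) print row[k]
        }
    ' "$@" | sort -k2,2                  # for-in order is unspecified, so sort
}
```

Usage would look like common_col2 file1 file2 file3 file4 file5.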
Hi Experts
I have escape-separated fields in a Unix file. From the file in the format below I need to extract the first column. Please help, it's urgent.
cat -v op.dat | head
24397028^
I want to extract the file in the format below (with only the first column)
24397028
2439707
thanks.
... (6 Replies)
Discussion started by: neha_suri06
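If the separator really is the ESC character (byte 0x1b, which cat -v would show as ^[), cut can take it as a delimiter. That is an assumption: the cat -v output above is cut off after the caret, so the actual control character is not visible. The function name first_field_esc is hypothetical.

```shell
#!/bin/sh
# first_field_esc  (hypothetical helper)
# Reads lines on stdin and prints the text before the first ESC character.
# ASSUMPTION: the field separator is ESC (octal 033).
first_field_esc() {
    cut -d "$(printf '\033')" -f1
}
```

For example: first_field_esc < op.dat. Lines that contain no ESC are passed through unchanged, which is cut's default behaviour without -s.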
LEARN ABOUT OSF1
eucset
eucset(1) General Commands Manual eucset(1)
NAME
eucset - Sets and gets EUC code widths for the terminal
SYNOPSIS
eucset [cswidth]
eucset -p
OPTIONS
-p Displays the current settings of the EUC character widths for the terminal
DESCRIPTION
The eucset command sets or gets the encoding and display widths of the Extended UNIX Code (EUC) characters processed by the current input
terminal. EUC is an encoding method for code sets composed of single or multiple bytes. It permits applications and the terminal hardware
to use the 7-bit US ASCII code and up to three single- or multibyte code sets simultaneously.
If you use the eucset command to set EUC character widths, but do not specify the cswidth argument, 7-bit U.S. ASCII is applied as a
default code set. You must use the command to specify any other EUC code sets, whether they are single-byte or multibyte.
EUC Code Set Classes
EUC divides code sets into four classes. Each code set class has two characteristics: the number of bytes for encoding the characters in
the class, and the number of display columns to display the characters in the class. All characters within a class possess the same
characteristics.
Class 0 consists of all 7-bit, single-byte ASCII characters. The most-significant bit of each of these characters is 0 (zero). Characters
in class 0 require one byte for encoding, and occupy one display column. These values are fixed for class 0 (zero). The 7-bit US ASCII
code is the primary EUC code set, which is available to users without direct specification.
A class 1 code set is a supplementary EUC code set. Class 1 characters have an initial byte whose most-significant bit is 1. If character
classes 2 or 3 are to be used, this initial byte must not be the SS2 or SS3 character, as these designate character classes 2 and 3.
Characters in class 1 may require more than 1 byte for encoding, and may require more than 1 display column. The eucset command must be used
to set the characteristics for code set class 1.
Class 2 and 3 code sets are supplementary EUC code sets. Characters in these classes have an initial byte of SS2 or SS3, respectively.
They require more than 1 byte for encoding, and may require more than 1 display column. The eucset command must be used to set the
characteristics for code set classes 2 and 3.
The cswidth argument in the eucset command line is a character string that describes the character widths for code set classes 1 through 3.
The string is of the following format: X1[:Y1], X2[:Y2], X3[:Y3]
The value X1 is the number of bytes required to encode a character in code set class 1. Y1 is the number of display columns needed to
display characters in this class. X2 is the number of bytes required to encode a character in code set class 2, not counting the SS2 byte,
and Y2 is the number of display columns for code set class 2 characters. X3 is the number of bytes needed to encode characters in code set
class 3, not counting the SS3 byte, and Y3 is the number of display columns required for these characters. The values for the column widths
can be omitted if they are equal to the number of encoding bytes. If the encoding value of any of the EUC code sets is set to 0 (zero),
this indicates that the code set does not exist.
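The cswidth format described above can be illustrated with a small parser. This is a sketch only: it does not call eucset, it just splits a cswidth string and applies the two rules stated in the text (a missing column width equals the byte count, and a byte count of 0 means the code set does not exist). The function name parse_cswidth and the wording of its output are hypothetical.

```shell
#!/bin/sh
# parse_cswidth STRING  (hypothetical helper, not part of eucset)
# Prints one line per code set class for a string such as 2:2,2:2,0:0.
parse_cswidth() {
    printf '%s\n' "$1" | tr -d ' ' | tr ',' '\n' | awk -F: '
        {
            # Column width defaults to the byte count when it is omitted.
            cols = (NF > 1 && $2 != "") ? $2 : $1
            printf "class %d: %d byte(s), %d column(s)%s\n", NR, $1, cols,
                ($1 == 0 ? " (code set absent)" : "")
        }'
}
```

With this sketch, parse_cswidth 2,2,0 and parse_cswidth 2:2,2:2,0:0 print the same three lines, in keeping with the rule that an omitted column width equals the byte count.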
If no cswidth argument is supplied, the eucset command uses the value of the CSWIDTH environment variable. If this variable is not present,
the default string 1:1,0:0,0:0 is substituted. This default string designates that the environment uses a single-byte EUC code set that has
characters in the EUC code set class 1 format. If the environment uses a multibyte EUC code set in the code set class 1 format, single- or
multibyte EUC code sets in the code set class 2 or 3 format, or both, the default setting cannot be used.
DIAGNOSTICS
Your standard input is not an interactive terminal.
The maximum character width of 8 was exceeded.
EXAMPLES
To display the encoding and display widths for the EUC code set classes 1-3 in your environment, enter:
eucset -p
To change the current settings of the encoding and display widths for the EUC characters in code set classes 1 and 2 to 2 bytes each, enter:
eucset 2:2,2:2,0:0
or
eucset 2,2,0
SEE ALSO
Interfaces: eucioctl(7)
eucset(1)