Full Discussion: Compare 2 flat files
Forum: Shell Programming and Scripting
Post 302383316 by clx on Tuesday 29th of December 2009 08:41:10 AM
Plain awk has some hard limits, e.g.:
Code:
Item                              Limit
Number of fields per record       100
Characters per input record       3000
Characters per output record      3000
Characters per field              1024
Characters per printf string      3000
Characters in literal string      400
Characters in character class     400
Files open                        15
Pipes open                        1


That said, gawk, mawk, and other newer awk versions are the alternatives if you run into these limits.

Reference: O'Reilly, sed & awk, section 10.8, "Limitations"

Just wanted to share this info.
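
For what it's worth, here is a rough sketch (mine, not from the book) of how one could flag records that would trip the per-record limits listed above before handing a file to an old awk. The function name check_awk_limits is made up for illustration, and the thresholds are simply the 100 / 3000 / 1024 values from the table. It assumes the check itself is run with a newer awk such as gawk or mawk, since a limited awk could not read the oversized records in the first place.

```shell
# check_awk_limits FILE [FS]
# Hypothetical helper: report records that exceed the classic awk limits
# (100 fields per record, 3000 characters per record, 1024 per field).
# FS defaults to a single space, i.e. awk's normal whitespace splitting.
check_awk_limits() {
    awk -F"${2:- }" '
        # whole record longer than 3000 characters
        length($0) > 3000 {
            printf "line %d: record has %d characters\n", NR, length($0)
        }
        # more than 100 fields in the record
        NF > 100 {
            printf "line %d: record has %d fields\n", NR, NF
        }
        # any single field longer than 1024 characters
        {
            for (i = 1; i <= NF; i++)
                if (length($i) > 1024)
                    printf "line %d: field %d has %d characters\n", NR, i, length($i)
        }
    ' "$1"
}
```

Usage would be something like `check_awk_limits data.txt '|'` for a pipe-delimited file. No output means no record went over those three limits; the other limits in the table (printf strings, open files, pipes) depend on the script rather than the data, so they are not checked here.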
 

Wototo(5)							File Formats Manual							 Wototo(5)

NAME
Wototo, wototo - Introduction to the Thai language standard

DESCRIPTION
Wototo is the Thai language software standard. It describes Thai characters and their classifications. This standard also describes the methods used to input and output Thai characters.

Thai Character Sets

The following two character sets are defined for the Thai language:

   Basic character set
   Auxiliary character set

In the basic character set, characters are 8-bit coded and have values from 0 to 255. Character values correspond to the characters defined in standards as follows:

   Values 0 to 7F correspond to characters from the ISO 646-1983 standard.
   Values A1 to FB (except for DB, DD and DE) correspond to characters from the TIS 620-2533 standard.
   Remaining values are reserved for future use.

The encoded form of the basic character set is called the TACTIS codeset, which is discussed in the TACTIS(5) reference page.

Characters in the auxiliary character set use the code values 32 to 126 and 161 to 254 only. The Wototo standard specifies that implementations provide at least one auxiliary character set.

Character Classification

In the TACTIS codeset, characters are organized into different classes. This classification is done only to facilitate processing and is not related to Thai linguistic or grammatical rules. The codeset contains the following character classes:

   Nondisplayable characters that are used for controlling output or data communication. The sixty-six control character values are: 00 to 1F, 7F, 80 to 9F, and FF.
   The Thai consonants as defined in TIS 620-2533.
   The five leading vowels as defined in TIS 620-2533.
   The six following vowels as defined in TIS 620-2533.
   The two below vowels as defined in TIS 620-2533.
   The five above vowels as defined in TIS 620-2533.
   The four tone marks as defined in TIS 620-2533.
   The four above diacritics as defined in TIS 620-2533.
   The below diacritic as defined in TIS 620-2533.
   Those characters that do not fit into preceding five character classes.
This group includes 119 characters that users cannot compose with above vowels, below vowels, tone marks, and above and below diacritics. Non-composable characters are divided into the following seven groups:

   Graphic characters
      The 94 graphic characters defined in ISO 646-1983. These include:
         52 English alphabetic characters
         10 digits
         32 special characters whose values are 21 to 2F, 3A to 40, 5B to 60, and 7B to 7E
   Space
      Character code value is 20.
   Nobreak space
      Character code value is A0.
   Thai digits
      The 10 Thai digits as defined in TIS 620-2533.
   Thai special characters
      The 6 Thai special characters as defined in TIS 620-2533.
   Word separator
      The word separator as defined in TIS 620-2533.
   Reserved code points
      6 code points reserved for future use.

To better describe Thai input and output methods, characters in the classes FV, BV, AV, and AD are further divided into subclasses. The following list describes character classes and subclasses by the number of characters in the class and their encoded values:

   Number: 66   Values: 00 to 1F, 7F, 80 to 9F, and FF
   Number: 119  Values: 20 to 7E (ISO 646-1983 character codes); A0, CF, DC, DF, E6, EF, F0 to F9, FA, and FB (TIS 620-2533 character codes); DB, DD, DE, FC, FD, and FE (Reserved code points)
   Number: 44   Values: A1 to C3, C5, and C7 to CE
   Number: 5    Values: E0, E1, E2, E3, and E4
   Number: 3    Values: D0, D2, and D3
   Number: 1    Value: E5
   Number: 2    Values: C4 and C6. These two characters also behave as leading vowels (LV) in the character sequence LV+CONS.
   Number: 1    Value: D8
   Number: 1    Value: D9
   Number: 1    Value: DA
   Number: 4    Values: E8, E9, EA, and EB
   Number: 2    Values: ED and EC
   Number: 1    Value: E7
   Number: 1    Value: EE
   Number: 1    Value: D4
   Number: 2    Values: D1 and D6
   Number: 2    Values: D5 and D7

Character Levels

Thai characters are classified according to different display levels (relative to baseline and nondisplayable). Classification by display levels facilitates the character input procedures. There are five character classification levels.
Four levels include displayable characters and one level includes nondisplayable characters, as follows:

   Nondisplayable level
      Includes all control characters in the CTRL class.
   Base level
      Includes all characters in the NON, CONS, FV, and LV classes. Characters at this level are drawn on baseline.
   Above level
      Includes all characters in the AD3, AV1, AV2, and AV3 classes. Characters at this level are drawn immediately above final consonants.
   Below level
      Includes all characters in the BV1, BV2, and BD classes. Characters at this level are drawn immediately below final consonants.
   Top level
      Includes all characters in the TONE, AD1, and AD2 classes. Characters at this level are drawn on top of the characters at the above level. If above level characters do not exist, top level characters are drawn at the above level. Characters at this level also indicate the end of character cells.

The standard specifies that the properties of Thai characters can be tested by using the following functions.

Note: These functions are not implemented in Tru64 UNIX.

   Determines the character level class that the character belongs to and returns the numeric value 0, 1, 2, 3, or 4. These return values can be represented by the constants NONDISP, TOP, ABOVE, BASE, or BELOW, respectively.
   Returns TRUE if a character is alphabetic.
   Returns TRUE if a character is either alphabetic or a digit.
   Returns TRUE if a character belongs to the CTRL class.
   Returns TRUE if the character is a digit.
   Returns TRUE if the character is not in the NONDISP level class.
   Returns TRUE if the character is an English lowercase letter (a to z).
   Returns TRUE if the character is an English uppercase letter (A to Z).
   Returns TRUE if a character is not in the NONDISP level class.
   Returns TRUE if the character is a space, formfeed, newline, return, tab, vertical tab, or wordbreak character.
   Returns TRUE if the character is a hexadecimal digit 0 to 9, A to F, or a to f. (Thai digits are excluded.)
Thai Input Methods

The input method for Thai characters directly maps characters to keys, as for English. Thai character sequences are entered character by character and display from left to right, regardless of whether the sequence includes forward characters (characters in the NON, CONS, LV, FV1, FV2, FV3 classes) or dead characters (characters in all other classes). However, the following basic rules apply to the character input sequence:

   Every display cell must begin with a character on the baseline (in the BASE class).
   A character in the BASE class that is also in the CONS class may be followed by an above vowel, a below vowel, a tone mark, a below diacritic, or an above diacritic.

For more detailed input sequence rules, refer to the Draft Industrial Standard - Thai Language Software Standard WTT2.0 (Part 2: Thai Input and Output Methods).

SEE ALSO
Commands: locale(1)

Others: i18n_intro(5), i18n_printing(5), l10n_intro(5), TACTIS(5), Thai(5)

Wototo(5)
Unix & Linux Forums Content Copyright 1993-2022. All Rights Reserved.