Converting unstructured data to structured data Post: 302975060

10 More Discussions You Might Find Interesting

1. Shell Programming and Scripting

Converting HTML data into a spreadsheet

Hi, I have a Perl script that prints some data in the form of an HTML table. Now I want to convert this data into a report in an Excel sheet. How can I do this? Regards, Garric

2. UNIX for Dummies Questions & Answers

Converting tabular format data to comma-separated data in KSH

Hi, could anyone help me change tabular-format output into a comma-separated file in ksh, please? It's very urgent. E.g.:

username empid
------------------------
sri 123

to

username,empid
sri,123

Thanks, Hema

3. Shell Programming and Scripting

Extract data based on match against one column data from a long list data

My input file:

data_5 Ali 422 2.00E-45 102/253 140/253 24
data_3 Abu 202 60.00E-45 12/23 140/23 28
data_1 Ahmad 256 7.00E-45 120/235 140/235 22
data_4 Aman 365 8.00E-45 15/65 140/65 20
data_10 Jones 869 9.00E-45 65/253 140/253 18...

4. Shell Programming and Scripting

Help converting row data to columns

I've been trying to figure this out for a while but I'm completely stumped. I have files with data in rows and I need to convert the data to columns. Each record contains four rows with a "field name: value" pair. I would like to convert it to four columns with the field names as column headers...

5. Shell Programming and Scripting

Help with Converting UTF-8 data to Unicode

How can I get an error when converting 3rd line, since it has invalid characters abcde a®cdée a�cd� Unicode for ® = � é = � I used "iconv -f UTF-8 -t ISO-8859-15 in.txt > out.txt"

6. Shell Programming and Scripting

Converting variable space width data into CSV data in bash

Hi All, I was wondering how I can convert each line of an input file, where fields are separated by variable-width spaces, into a CSV file. Below is the scenario I am looking for. My input data in inputfile.txt: 19 15657 15685 Sr2dReader 107.88 105.51...

7. Shell Programming and Scripting

[SOLVED] Converting data from one format to the other

Hi All, I need to convert an Excel spreadsheet into a SAS dataset, and the following format change is needed. Please help, this is too complex for a biologist. Let me describe the input. 1st row is generation. 1st column is keyword 'generation', starting from 2nd column there are 5...

8. Shell Programming and Scripting

[Solved] Converting the data into matrix with 0's and 1's

I have a file that contains 2 columns, tag and pos:

cat input_file
tag pos
atg 10
ata 16
agt 15
agg 19
atg 17
agg 14

I have used the following command to sort the file based on the second column:

sort -k 2 input_file
tag pos
atg 10
agg 14
agt 15
ata 16
agg 19
atg 17

9. Shell Programming and Scripting

Converting data from specific columns

I have a file (CSV, txt, or anything) which has 4 columns (id, name, number, location) and contains data. I want to convert the data in specific columns, such as name to ooooo and number to 88888, matching the field length of those columns. For example, if the name column has anthony, which is 7 characters, it should...

10. UNIX for Beginners Questions & Answers

Data extraction and converting into .csv file.

Hi All, I have a data file and need to extract and convert it into csv format: 1) Read and extract the line containing string ending with "----" (file sample_linebyline.txt file) and to make a .csv file from this. 2) To read the flat file flatfile_sample.txt which consists of similar data (...

wdutil

wdutil(1)						      General Commands Manual							 wdutil(1)

NAME

       wdutil - manipulate Native Language I/O word dictionary

SYNOPSIS

       wdutil [ -c | -i[kcap][,dcap] | -jjfile ] file
       wdutil [ -pd[desig] | -pk[desig] ] file
       wdutil [ -sd[[+|-]val] | -sk[[+|-]val] ] file
       wdutil [ -ud | -uk | -ut ] file
       wdutil [ -d | -l ] file

DESCRIPTION

       wdutil  is used to manipulate the word dictionary used by Native Language I/O for phrase and word conversion.  The word dictionary consists
       of a key entries block and a data entries block.  The key entries block holds the designations, and the data entries block holds the  words
       corresponding  to  each	designation.  wdutil also functions as a filter for transforming a word dictionary to a text file, and vice versa.
       See the Text File section for the layout of a text file.

       wdutil recognizes one of the options below.  If no option is specified and the file is a valid word dictionary, the capacity of the key and
       data entries blocks in the file is displayed.  Otherwise, an error message is printed.

       The capacity of the key entries block determines the maximum number of designations.  The capacity of the data entries block determines the
       maximum number of words.

   Options
       -c	      Condense the data entries block in the file to obtain a larger contiguous free area.  If the format version of the  file	is
		      old, it is updated.

       -i[kcap][,dcap]
		      Initialize  the  file  as  a  word  dictionary which has key entries block capacity specified by kcap and data entries block
		      capacity specified by dcap.  If the file does not exist, it is created.  The default values are 499 for  kcap  and  650  for
		      dcap.

       -jjfile	      Join  the  dictionary  jfile into the file.  The capacity of the resulting file is the sum of the capacities of the original
		      file and the jfile.

       -pk[desig]     Display the designations in the order of their code value.  If desig ends with  *,  designations	starting  with	desig  are
		      printed.	If desig is * or omitted, all designations in the file are printed.

       -pd[desig]     Display  the  designations  and  their corresponding words and parts of speech.  The string desig has the same format as in
		      -pk.

       -sd[[+|-]val]  Change the capacity of the data entries block in the file.  If + or - precedes val, the  current	value  is  incremented	or
		      decremented by val.  Otherwise, the capacity is changed to val.  The default value for val is 650.

       -sk[[+|-]val]  Change the capacity of the key entries block in the file.  The number val has the same format as in -sd option.  The default
		      value for val is 499.

       -ud	      Display the capacity and usage of the data entries block, and the size of contiguous free area.

       -uk	      Display the capacity and usage of the key entries block.

       -ut	      Display the capacity and usage of both the key and data entries blocks, and the size of contiguous free  area  of  the  data
		      entries block.

       -d	      Read  a  word dictionary, transform it into text form, and dump it to the standard output.  If the word includes a character
		      whose code is undefined in $LANG code set, its internal code is dumped in hexadecimal notation.

       -l	      Load the entry lines in text form from the standard input into the specified word dictionary.  If the specified dictionary
		      exists, wdutil overwrites it with the loaded entry lines; otherwise wdutil creates a new one containing them.  If an entry
		      line is invalid, it is rejected and an error message is written to the standard error.
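
The argument forms accepted by -sd/-sk and by -pk/-pd can be sketched in a few lines of Python. This is only an illustration of the rules stated above, not wdutil's actual implementation: the function names resolve_capacity and designation_matches are hypothetical, and the exact-match behaviour for a desig without a trailing * is an assumption, since the text does not describe that case.

```python
def resolve_capacity(current: int, val: str, default: int) -> int:
    """Return the capacity requested by a -sd/-sk style argument.

    val is the text following the option letters:
      ""      -> use the documented default (650 for -sd, 499 for -sk)
      "700"   -> set the capacity to 700
      "+100"  -> increment the current capacity by 100
      "-50"   -> decrement the current capacity by 50
    """
    if val == "":
        return default
    if val[0] in "+-":
        delta = int(val[1:])
        return current + delta if val[0] == "+" else current - delta
    return int(val)


def designation_matches(designation: str, desig: str = "") -> bool:
    """Apply the -pk/-pd selection rule to a single designation."""
    if desig in ("", "*"):
        return True  # omitted or bare "*": every designation is printed
    if desig.endswith("*"):
        return designation.startswith(desig[:-1])  # prefix match
    return designation == desig  # assumption: exact match otherwise


print(resolve_capacity(650, "+100", 650))   # relative change
print(designation_matches("kanji", "ka*"))  # prefix selection
```

For -sk, the value computed this way is only the requested capacity; the WARNINGS section describes how the key entries block capacity is then adjusted.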

   Text File
       Each entry line in the text file consists of the following fields, terminated by a newline.  White space can be used as the field
       separator.  The 3rd field is effective only if LANG=japanese, japanese.euc, ja_JP.SJIS, or ja_JP.eucJP.

	      designation   word   hinshi(part of speech)

       designation
	      Consists of up to sixteen characters excluding special characters.  However, after being transformed by the -d option, all
	      characters in designation are 2-byte characters in a text file.

       word   The word corresponding to designation consists of up to 50 bytes of multi-byte characters. The word may  have  hexadecimal  notation
	      instead of multi-byte characters. For example, the hexadecimal notation 'x7e7e' is recognized as a character whose internal code is
	      0x7e7e.

       hinshi Specify a part of speech, which is one of noun, sa-hen verb, surname, personal name, and address.  These are written as
	      FUTSUUMEISHI (or simply MEISHI), SAHENDOUSHI (or simply SAHEN), SEI, MEI, and CHIMEI in kanji characters.  If nothing is
	      specified, wdutil sets it to FUTSUUMEISHI automatically.
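
As a rough illustration of the entry-line layout described above, the following Python sketch splits one line into its fields and applies the stated limits. It is not wdutil's parser. The name parse_entry_line is hypothetical, the romanized hinshi names stand in for the kanji spellings the real tool expects, the 50-byte check assumes an EUC-JP code set (the real limit depends on $LANG), and hexadecimal notation in the word field is not decoded here.

```python
# Short forms named in the text, mapped to their full names.
HINSHI_ALIASES = {"MEISHI": "FUTSUUMEISHI", "SAHEN": "SAHENDOUSHI"}
HINSHI_VALUES = {"FUTSUUMEISHI", "SAHENDOUSHI", "SEI", "MEI", "CHIMEI"}


def parse_entry_line(line: str, encoding: str = "euc_jp"):
    """Split one text-file entry into (designation, word, hinshi)."""
    fields = line.split()  # white space is the field separator
    if len(fields) not in (2, 3):
        raise ValueError("expected: designation word [hinshi]")

    designation, word = fields[0], fields[1]
    # A missing third field defaults to FUTSUUMEISHI.
    hinshi = fields[2] if len(fields) == 3 else "FUTSUUMEISHI"
    hinshi = HINSHI_ALIASES.get(hinshi, hinshi)

    if hinshi not in HINSHI_VALUES:
        raise ValueError("unknown part of speech: " + hinshi)
    if len(designation) > 16:
        raise ValueError("designation longer than sixteen characters")
    if len(word.encode(encoding)) > 50:
        raise ValueError("word longer than 50 bytes")
    return designation, word, hinshi


print(parse_entry_line("benkyou 勉強 SAHEN\n"))
```

A caller in the spirit of the -l option would run this over each input line and report rejected lines on the standard error instead of stopping at the first failure.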

EXTERNAL INFLUENCES

   International Code Set Support
       Single byte and multibyte character code sets are supported.

WARNINGS

       The smallest prime number not smaller than the given value is used as the capacity of a key entries block.  However, if the given value	is
       smaller than 5, 5 is used.
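
The rounding rule above can be worked through with a short Python sketch. It only mirrors the rule as stated (smallest prime not smaller than the given value, with a floor of 5); the function names are hypothetical and the trial-division primality test is chosen for clarity, not because wdutil is known to use it.

```python
def is_prime(n: int) -> bool:
    """Trial-division primality test, adequate for small capacities."""
    if n < 2:
        return False
    i = 2
    while i * i <= n:
        if n % i == 0:
            return False
        i += 1
    return True


def key_block_capacity(requested: int) -> int:
    """Smallest prime >= requested, never smaller than 5."""
    candidate = max(requested, 5)
    while not is_prime(candidate):
        candidate += 1
    return candidate


for value in (3, 499, 500):
    print(value, "->", key_block_capacity(value))
```

Under this rule the default kcap of 499 is kept unchanged because 499 is already prime, while a request for 500 yields 503.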

       A voiced or non-voiced plosive in a designation is counted as 1 character in a text file.

       User dictionaries in the old format version are supported on HP-UX 10.0, but they will not be supported in the future.  To update them,
       use the -c option:

	      $ wdutil -c file

AUTHOR

       wdutil was developed by HP.

																	 wdutil(1)
