Extracting content of a file Post: 302447185


Post 302447185 by l20N1N on Saturday 21st of August 2010 07:19:58 PM


Hello, I'm working on a script that extracts the contents of a file (generally a plain text file with numbers, symbols, and letters) and writes them to a .txt file, but my attempt is kind of all over the place. The output must not include duplicates, and the content has to be readable. I have jumped around a lot while learning scripting, but I managed to get the hang of the translate command (tr). I'm fairly new to awk, but I have heard it can be more effective and works similarly. I was also wondering whether I'm just making this more complicated than it needs to be, when sort and uniq might be able to do the job.

Note: I will be running this script many times. Is it possible to keep updating the output file so that the content is extracted cumulatively?

My logic for the script so far is:

1. read (while loop maybe?)
2. sort/uniq -c (to eliminate duplicates)
3. awk (to eliminate gibberish?)

> filename.txt
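
The three steps above can be sketched as a single pipeline. This is only a sketch under assumptions: the sample text and file names are made up, "gibberish" is taken to mean anything that is not a letter, and awk is used only to drop empty lines. tr reads the whole file by itself, so no explicit while loop is needed, and sort -u gives the same result as sort | uniq.

```shell
#!/bin/bash
# Sketch only: the file names and sample text are invented for illustration.
workdir=$(mktemp -d)
input="$workdir/input.txt"
output="$workdir/filename.txt"

printf '## Hello, world! hello again.\nWorld 42\n' > "$input"

# 1. Read: tr takes the file on stdin, so no loop is required.
# 2. Lowercase, then turn every run of non-letters into one newline
#    (one word per line).
# 3. awk 'NF' drops empty lines; sort -u removes duplicate words.
tr 'A-Z' 'a-z' < "$input" | tr -cs 'a-z' '\n' | awk 'NF' | sort -u > "$output"

cat "$output"
```

Swapping sort -u for sort | uniq -c would keep one line per word as well, but with a count in front of each word.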

My code so far:

Code:

#!/bin/bash
# Check for input file on command line.
ARGS=1
E_BADARGS=65
E_NOFILE=66

if [ $# -ne "$ARGS" ]  # Correct number of arguments passed to script? Or is this too complicated for something easy?
then
  echo "Usage: $(basename "$0") filename"
  exit $E_BADARGS
fi

if [ ! -f "$1" ]       # Check if file exists.
then
  echo "File \"$1\" does not exist."
  exit $E_NOFILE
fi


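# So far I have it set to translate the output by feeding tr back into itself. Will this work?
# Or is awk more effective? What about the use of | sort | uniq -c?

# Read the input file, lowercase it, and replace whitespace, non-letters,
# and high-byte characters with single spaces. Z is a temporary placeholder.
tr A-Z a-z < "$1" | tr '[:space:]' Z | \
tr -cs '[:alpha:]' Z | tr -s '\173-\377' Z | tr Z ' ' > output.txt

# for or while loop?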


exit 0
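
One possible way to keep the output file growing across runs, as asked in the note above, is to merge each run's words with the existing file and de-duplicate again. A minimal sketch, assuming a hypothetical helper named extract_words and made-up sample files:

```shell
#!/bin/bash
# Sketch only: extract_words and all file names here are hypothetical.
extract_words() {
  # One lowercase word per line, empty lines dropped.
  tr 'A-Z' 'a-z' < "$1" | tr -cs 'a-z' '\n' | awk 'NF'
}

workdir=$(mktemp -d)
output="$workdir/output.txt"
printf 'apple banana\n' > "$workdir/run1.txt"
printf 'banana cherry\n' > "$workdir/run2.txt"

# Each loop pass stands in for one separate run of the script.
for input in "$workdir/run1.txt" "$workdir/run2.txt"; do
  touch "$output"   # make sure the file exists on the first run
  # Merge the new words (stdin, "-") with the earlier results, then
  # replace the old file. Writing to a temporary file first avoids
  # truncating output.txt while sort is still reading it.
  extract_words "$input" | sort -u - "$output" > "$output.tmp"
  mv "$output.tmp" "$output"
done

cat "$output"
```

Appending with >> would also accumulate results, but duplicates from earlier runs would then survive, which is why the sketch re-sorts the merged data instead.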

Last edited by l20N1N; 08-21-2010 at 09:09 PM.. Reason: corrections



LEARN ABOUT MINIX

join

JOIN(1) 						      General Commands Manual							   JOIN(1)

NAME

       join - relational database operator

SYNOPSIS

       join [-an] [-e s] [-o list] [-tc] file1 file2

DESCRIPTION

       Join  forms,  on the standard output, a join of the two relations specified by the lines of file1 and file2.  If file1 is `-', the standard
       input is used.

       File1 and file2 must be sorted in increasing ASCII collating sequence on the fields on which they are to be joined, normally the first in
       each line.

       There is one line in the output for each pair of lines in file1 and file2 that have identical join fields.  The output line normally
       consists of the common field, then the rest of the line from file1, then the rest of the line from file2.
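
To illustrate the pairing rule just described, here is a small sketch with invented sample data (names.txt and cities.txt are hypothetical files, already sorted on the first field). It assumes a join that behaves as described here for the default options, such as the one shipped with GNU coreutils.

```shell
#!/bin/bash
# Sketch only: the two relations below are invented sample data.
workdir=$(mktemp -d)
printf '1 alice\n2 bob\n3 carol\n' > "$workdir/names.txt"
printf '1 london\n3 paris\n4 rome\n' > "$workdir/cities.txt"

# Both files are sorted on field 1, which is the default join field.
# Only keys present in both files (1 and 3) produce an output line:
# the common field, then the rest of each matching line.
joined=$(join "$workdir/names.txt" "$workdir/cities.txt")
printf '%s\n' "$joined"
```

Keys 2 and 4 appear in only one file each, so they are silently dropped unless an option asks for unpairable lines.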

       Fields are normally separated by blank, tab or newline.  In this case, multiple separators count as one, and leading separators are
       discarded.

       These options are recognized:

       -an    In addition to the normal output, produce a line for each unpairable line in file n, where n is 1 or 2.

       -e s   Replace empty output fields by string s.

       -o list
	      Each output line comprises the fields specified in list, each element of which has the form n.m, where n is a file number and m is a
	      field number.

       -tc    Use character c as a separator (tab character).  Every appearance of c in a line is significant.
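
The options above can be combined. The sketch below uses invented comma-separated sample files and assumes a POSIX-style join (for example GNU coreutils), where the options are written as -a 1 and -t , and where -e takes effect together with -o.

```shell
#!/bin/bash
# Sketch only: names.csv and cities.csv are invented sample data.
workdir=$(mktemp -d)
printf '1,alice\n2,bob\n3,carol\n' > "$workdir/names.csv"
printf '1,london\n3,paris\n' > "$workdir/cities.csv"

# -t ,          use a comma as the field separator
# -a 1          also print lines from file 1 that have no partner
# -o 1.1,1.2,2.2  choose the output fields as file.field pairs
# -e NONE       fill output fields that are missing with the string NONE
report=$(join -t , -a 1 -e NONE -o 1.1,1.2,2.2 \
  "$workdir/names.csv" "$workdir/cities.csv")
printf '%s\n' "$report"
```

Key 2 has no city, so its line is kept because of -a 1 and its missing field is filled in by -e.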

SEE ALSO

       sort(1), comm(1), awk(1).

BUGS

       With default field separation, the collating sequence is that of sort -b; with -t, the sequence is that of a plain sort.

       The conventions of join, sort, comm, uniq, look and awk(1) are wildly incongruous.

7th Edition							  April 29, 1985							   JOIN(1)
