Extract strings from the file using awk s
Post 303045844 by Mannu2525 on Wednesday 15th of April 2020 10:37:19 AM

Hi,

I'm trying to produce a file that contains only the useful information from a log. The format of the input file is below:

Code:
--- (Tue Apr 14 09:46:43 EDT 2020): JOIN_Constraints_Schema:test_simple_joins -------------
internal optimizer errors: 0
--- (Tue Apr 14 09:48:10 EDT 2020): JOIN_Constraints_Schema:test_constraint_joins_setop_dis_oby_lmt -------------
Number of queries that causes internal optimizer errors: 0
--- (Tue Apr 14 09:49:02 EDT 2020): in External_Table_Schema:test_subquery_in_from ---------------
--- (Tue Apr 14 09:49:10 EDT 2020): EventSeries_Schema:test_Event_Series1 -------------
--- (Tue Apr 14 09:49:17 EDT 2020):  Gosalesdw_Schema:test_complex_analytics -------------
--- (Tue Apr 14 09:49:25 EDT 2020):  GBY_Schema1:test_Groupby_Rollup -------------
internal optimizer errors: 0
--- (Tue Apr 14 09:49:40 EDT 2020):  GBY_Schema1:test_Groupby_GroupingSets -------------
internal optimizer errors: 0
--- (Tue Apr 14 09:49:52 EDT 2020):  GBY_Schema1:test_Groupby_Cube -------------
internal optimizer errors: 0
--- (Tue Apr 14 09:50:05 EDT 2020):  GBY_Schema1:test_gby -------------
internal optimizer errors: 0

I need to extract the text after the timestamp, together with the error count from the immediately following line when that line reports internal optimizer errors.

The output file, newout.txt, should look like this:
Code:
JOIN_Constraints_Schema,test_simple_joins,0
JOIN_Constraints_Schema,test_constraint_joins_setop_dis_oby_lmt,0
GBY_Schema1,test_Groupby_Rollup,0
GBY_Schema1,test_Groupby_GroupingSets,0
GBY_Schema1,test_Groupby_Cube,0
GBY_Schema1,test_gby,0

Exception: if a timestamp line is not immediately followed by an internal optimizer errors line, nothing needs to be printed for it.

I tried a simple script, but it does not work for me:

Code:
cat newout.txt | while read x
do
        if [ $(echo $x | grep -E '20[0-9][0-9]\):' | wc -l) == 1 ]
        then
                schema=$(echo $x | awk '{print $(NF-1)}' | awk -F ':' '{print $1}')
                config=$(echo $x | awk '{print $(NF-1)}' | awk -F ':' '{print $2}')
        else
                internal=$(echo $x | awk -F ':' '{print $2}')
        fi
        echo $schema,$config,$internal
done

Please help me achieve this.
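
One possible approach, offered as a sketch rather than a definitive answer: let a single awk program remember the Schema:test pair from each timestamp line and print it only when the very next line mentions internal optimizer errors. Two things in the loop above seem to work against the stated goal: it reads newout.txt, which the post names as the output file, and it echoes on every iteration, so timestamp lines get printed too. The file names sample.log and the temporary directory below are made up for illustration, and the sample data is a shortened copy of the log in the post.

```shell
#!/bin/sh
# Sketch only: workdir, sample.log and the variable names are assumptions.
workdir=$(mktemp -d)
log="$workdir/sample.log"
out="$workdir/newout.txt"

# Shortened copy of the log from the post, used as test input.
cat > "$log" <<'EOF'
--- (Tue Apr 14 09:46:43 EDT 2020): JOIN_Constraints_Schema:test_simple_joins -------------
internal optimizer errors: 0
--- (Tue Apr 14 09:48:10 EDT 2020): JOIN_Constraints_Schema:test_constraint_joins_setop_dis_oby_lmt -------------
Number of queries that causes internal optimizer errors: 0
--- (Tue Apr 14 09:49:02 EDT 2020): in External_Table_Schema:test_subquery_in_from ---------------
--- (Tue Apr 14 09:49:10 EDT 2020): EventSeries_Schema:test_Event_Series1 -------------
--- (Tue Apr 14 09:49:25 EDT 2020):  GBY_Schema1:test_Groupby_Rollup -------------
internal optimizer errors: 0
EOF

awk '
/^--- \(/ {
    # Timestamp line: the Schema:test pair is the next-to-last field.
    split($(NF-1), part, ":")
    pending = part[1] "," part[2]
    next
}
/internal optimizer errors:/ && pending != "" {
    # Error line directly after a timestamp line: the count is the last field.
    print pending "," $NF
}
{ pending = "" }   # any other line closes the "immediately next line" window
' "$log" > "$out"

cat "$out"
```

Timestamp lines that are followed by another timestamp line simply overwrite the remembered pair, so they never reach the output. To run this against the real log, point the awk command at that file instead of the sample.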
 
