Post 302688135 by beca123456 on Friday 17th of August 2012 09:47:35 PM
change field separator only from nth field until NF

Hi !

input:
Code:
111|222|333|aaa|bbb|ccc
999|888|777|nnn|kkk
444|666|555|eee|ttt|ooo|ppp

With awk, I am trying to change the field separator "|" to "; " only from the 4th field until the end (the number of fields varies between records).

In order to get:
Code:
111|222|333|aaa; bbb; ccc
999|888|777|nnn; kkk
444|666|555|eee; ttt; ooo; ppp

I tried something like:
Code:
gawk 'BEGIN{FS=OFS="|"}{for(i=5; i<=NF; i++) $4 = $4 ($4?"; ":"")$i}1' input

It works for the 4th field, but it also prints the original fields from $5 to $NF after:
Code:
111|222|333|aaa; bbb; ccc|bbb|ccc
999|888|777|nnn; kkk|kkk
444|666|555|eee; ttt; ooo; ppp|ttt|ooo|ppp
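The extra fields appear because assigning to $4 changes only that one field; $5 through $NF are still part of the record when it is printed. One possible workaround, shown here as a minimal sketch rather than the definitive answer, is to leave the fields untouched and build the output line by hand, choosing the separator by field position. The variable names `out` and `sep` are illustrative only, and the sample input is inlined so the snippet runs on its own.

```shell
# Sample records copied from the post.
input='111|222|333|aaa|bbb|ccc
999|888|777|nnn|kkk
444|666|555|eee|ttt|ooo|ppp'

# Rebuild each line: "|" before fields 2-4, "; " before fields 5..NF.
result=$(printf '%s\n' "$input" | awk -F'|' '{
    out = $1
    for (i = 2; i <= NF; i++) {
        sep = (i <= 4) ? "|" : "; "
        out = out sep $i
    }
    print out
}')

printf '%s\n' "$result"
```

Because nothing is assigned to a field or to NF, this should behave the same way in gawk and in other awk implementations.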

Is there any way to create an array from $4 until $NF without knowing the number of fields (a[$4...$NF])?
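As far as I know, awk has no slice syntax such as a[$4...$NF], but a loop can copy the fields into an array one at a time. The following is a sketch under the assumption that every record has at least four fields; the names `a`, `n`, and `tail` are hypothetical, chosen only for this example.

```shell
# Sample records copied from the post.
input='111|222|333|aaa|bbb|ccc
999|888|777|nnn|kkk
444|666|555|eee|ttt|ooo|ppp'

result=$(printf '%s\n' "$input" | awk -F'|' '{
    # Copy fields 4..NF into a[1..n]; n is reset for every record.
    n = 0
    for (i = 4; i <= NF; i++)
        a[++n] = $i

    # Join the array elements with "; " (guard against n == 0).
    tail = (n > 0) ? a[1] : ""
    for (j = 2; j <= n; j++)
        tail = tail "; " a[j]

    # The first three fields keep the original "|" separator.
    print $1 FS $2 FS $3 FS tail
}')

printf '%s\n' "$result"
```

Only elements 1 through n are read for each record, so entries left over from a longer previous record are never used.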

Thanks for your help !