How to remove alphabets/special characters/space in the 5th field of a tab delimited file?
Post 302901494 by Don Cragun on Wednesday 14th of May 2014 05:08:31 AM
You said that there would never be embedded tab characters in a field in your input file, but there is a tab in the middle of field 5 in the two-line record on lines 6 and 7 of your latest sample input file.
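
One way to spot records like that is to print the number of tab-separated fields on each physical line and compare it with the header line. A line with too few fields is probably part of a record that has been split across lines, and a line with too many probably contains an embedded tab. This is only a sketch: the file name sample.txt, its contents, and the function name check_fields are made up for illustration and are not your data.

```shell
# Build a small made-up sample (hypothetical data, not the real input file).
printf '"id"\t"name"\t"qty"\n'   >  sample.txt
printf '"1"\t"widget"\t"10"\n'   >> sample.txt
printf '"2"\t"bro\tken"\t"20"\n' >> sample.txt   # embedded tab in field 2
printf '"3"\t"split\n'           >> sample.txt   # record continues on the next line
printf 'name"\t"30"\n'           >> sample.txt

# Report every line whose field count differs from the header line's count.
check_fields() {
    awk -F'\t' '
    NR == 1    { want = NF; next }
    NF != want { printf("line %d has %d fields, expected %d\n", NR, NF, want) }
    ' "$1"
}
check_fields sample.txt
```

With this sample, lines 3, 4, and 5 are reported: line 3 has an extra field because of the embedded tab, and lines 4 and 5 are the two halves of a split record.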

As long as there aren't any embedded tab characters immediately before or after a double quote character, the following seems to do what you want. (It is strange, however, that the first line of your sample input file has a trailing tab character.)
Code:
awk '
BEGIN {	FS = OFS = "\t"
}
{	# Accumulate lines until we have a line with six fields.
#	printf("Line %d, NF %d: %s\n", NR, NF, $0)
	while(gsub(/\"\t\"/, "&") < 5) {
		rc = (getline nl)
		if(rc != 1) {
			printf("Unexpected EOF: line %d, NF %d: %s\n", NR, NF, $0)
			exit 1
		}
		$0 = $0 nl
#		printf("Line %d added, NF %d, %s\n", NR, NF, $0)
	}
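	# Convert embedded tabs (not adjacent to a double quote) to "<tab>",
	# keeping the neighboring character that the pattern also matches.
	while(match($0, /[^"]\t|\t[^"]/)) {
		seg = substr($0, RSTART, RLENGTH)
		sub(/\t/, "<tab>", seg)
		$0 = substr($0, 1, RSTART - 1) seg substr($0, RSTART + RLENGTH)
#		printf("embedded tab replaced: %s\n", $0)
	}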
	if(NR > 1 && $5 !~ /^"[0-9]*"$/) $5 = "\"\""
	print
}' file2
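
To see how the accumulation loop behaves, here is a cut-down sketch of the same idea on a made-up three-field file, so it looks for 2 separators rather than 5. Calling gsub() with "&" as the replacement substitutes each match with itself, so the line is left unchanged and the return value is simply the number of quote-tab-quote separators found. The file name demo.txt, its contents, and the function name join_records are assumptions for illustration only.

```shell
# Made-up sample: the second record is split across two physical lines.
printf '"1"\t"widget"\t"10"\n' >  demo.txt
printf '"2"\t"spl\n'           >> demo.txt
printf 'it"\t"20"\n'           >> demo.txt

join_records() {
    awk '
    {   # gsub(..., "&") replaces each match with itself, so it only counts.
        while(gsub(/"\t"/, "&") < 2) {
            if((getline nl) != 1) {
                printf("Unexpected EOF at line %d: %s\n", NR, $0)
                exit 1
            }
            $0 = $0 nl      # append the continuation line to the record
        }
        print
    }' "$1"
}
join_records demo.txt
```

The two physical lines of the second record come out as one line with three fields. Note that the lines are joined with nothing in between, so the newline that split the record is dropped rather than preserved.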

 
