Retaining spaces between words


 
# 1  
Old 09-10-2007

Hi Experts,

I have a 2 GB flat file containing a Unicode field that is 4000 characters wide; in some records it is blank. The existing system uses a sed command to remove the spaces, and because of this one field the file takes almost three days to process. I replaced sed with tr, and it finished in less than a minute. The challenging part is that the character fields contain more than one space between words. I am using tr -s ' ' '' to remove the spaces, but it also squeezes the runs of spaces between the words down to one.


My sample record is this:

262774372|58959454 | Rajiv Rajiv | tuerueeu | | erueirei
647585858|784783434 | Ramesha Ramesha| tyuu5u4o| | ruieieiei

Earlier, the following command was used to remove the spaces:

sed 's/[[:space:]]*|/|/g; s/[ \t]*$//g' < File1 > File2

Output was:
262774372|58959454|Rajiv Rajiv|tuerueeu||erueirei
647585858|784783434|Ramesha Ramesha|tyuu5u4o||ruieieiei

Time taken to process file was 3.5 days

Later I added a tr command before the sed to remove the spaces faster:

tr -s ' ' '' < File1 > File2
sed 's/[[:space:]]*|/|/g; s/[ \t]*$//g;s/^[ \t]*//g;' < File2 > File3

Output was:
262774372|58959454|Rajiv Rajiv|tuerueeu||erueirei
647585858|784783434| Ramesha Ramesha|tyuu5u4o||ruieieiei

Time taken to process the file was less than a minute, since tr collapses the long runs of spaces quickly.

However, I am not able to retain the spaces between the words as they are, since tr -s squeezes every run of spaces down to one space.

The value | Rajiv Rajiv | -> changed to |Rajiv Rajiv|

I have to retain the spaces between the words, i.e. |Rajiv Rajiv|

Please let me know if you have any workaround...

Thanks,
Rajiv
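
The squeezing behaviour described in the post can be reproduced on a small scale. This is only an illustrative sketch: the sample string is made up, and it uses the single-argument form tr -s ' ' rather than the exact command quoted above.

```shell
# Hypothetical sample: three spaces between the words, two around the pipe
line='Rajiv   Rajiv  |  x'

# tr -s ' ' replaces every run of spaces with a single space,
# no matter where in the line the run occurs
squeezed=$(printf '%s\n' "$line" | tr -s ' ')

printf '%s\n' "$squeezed"
# prints: Rajiv Rajiv | x
```

tr works on characters only and has no notion of fields, so it cannot tell padding next to a delimiter from spacing inside a value.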
# 2  
Old 09-11-2007
The following should work for you.

tr -d "[= =]" < infile > outfile


Additionally, tr supports the [:space:] character class, like the one in your sed statement.
# 3  
Old 09-11-2007
Retaining spaces within words

Denn,

That eliminates all the spaces that exist between the words.

E.g., if I have data like this:

"Rajiv | Rajiv Rajiv Rajiv |Rajiv Rajiv"

the command you suggested results in this output:
"Rajiv|RajivRajivRajiv|RajivRajiv"

I need the output in the following format:
"Rajiv|Rajiv Rajiv Rajiv|Rajiv Rajiv"

Thanks,
Rajiv
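
One way to get the requested output, sketched here with sed on the sample line from the post, is to strip whitespace only where it touches a delimiter or a line boundary. This is an assumption about what is wanted, not the solution the poster eventually used, and it has not been timed against a 2 GB file.

```shell
# Sample line taken from the post above
line='Rajiv | Rajiv Rajiv Rajiv |Rajiv Rajiv'

# 1) drop whitespace on both sides of every '|'
# 2) drop leading whitespace on the line
# 3) drop trailing whitespace on the line
# Spaces between words never touch a delimiter, so they are left alone.
trimmed=$(printf '%s\n' "$line" |
    sed 's/[[:space:]]*|[[:space:]]*/|/g; s/^[[:space:]]*//; s/[[:space:]]*$//')

printf '%s\n' "$trimmed"
# prints: Rajiv|Rajiv Rajiv Rajiv|Rajiv Rajiv
```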
# 4  
Old 07-29-2008
Hi Rajiv,

Did you find a solution to the above problem?
Please help me; I am facing a similar problem.

Thanks,
Deepak
# 5  
Old 07-29-2008
Yes, I was able to achieve it.

Here is the command:

awk 'BEGIN{FS=OFS="|"} {for(i=1;i<=NF;i++)gsub("(^[[:space:]]*)|([[:space:]]*$)","",$i)};1' filename | awk 'NF > 0' > Output_Filename.txt


Thanks,
Rajiv
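
The idea behind the command above is to split each record on the pipe, trim each field at both ends only, and join the fields again. Here is a minimal sketch of that idea on two made-up records with doubled spaces inside the name field. It assumes only spaces and tabs need trimming, and it leaves out the second awk stage that drops blank lines.

```shell
# Hypothetical records: padded fields, two spaces between the name parts
input='262774372|58959454   |  Rajiv  Rajiv   |  tuerueeu |    | erueirei
647585858|784783434 |  Ramesha  Ramesha|  tyuu5u4o|  | ruieieiei'

result=$(printf '%s\n' "$input" | awk '
BEGIN { FS = OFS = "|" }      # split and rejoin on the pipe
{
    # trim spaces and tabs at the start and end of each field;
    # anything between two words is not anchored to ^ or $ and stays as is
    for (i = 1; i <= NF; i++)
        gsub(/^[ \t]+|[ \t]+$/, "", $i)
    print
}')

printf '%s\n' "$result"
# prints:
# 262774372|58959454|Rajiv  Rajiv|tuerueeu||erueirei
# 647585858|784783434|Ramesha  Ramesha|tyuu5u4o||ruieieiei
```

Because the anchors ^ and $ apply to each field rather than to the whole line, the double space inside "Rajiv  Rajiv" survives while the padding next to the delimiters is removed.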
# 6  
Old 07-29-2008
Hi Deepak,

You will have to change the delimiter. In my case the delimiter was the pipe '|', so replace the "|" in FS=OFS="|" with whatever delimiter you have.

Thanks,
Rajiv
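
To avoid editing the awk program for every new delimiter, the delimiter can be passed in with -v. The following is a sketch under assumptions: the variable names delim and d and the comma-separated record are illustrative and do not come from the thread.

```shell
# Hypothetical comma-delimited record; "delim" is an illustrative name
delim=','
line='101 ,  New  York ,  NY  '

out=$(printf '%s\n' "$line" | awk -v d="$delim" '
BEGIN { FS = OFS = d }        # use the same delimiter for input and output
{
    for (i = 1; i <= NF; i++)
        gsub(/^[ \t]+|[ \t]+$/, "", $i)   # trim both ends of each field
    print
}')

printf '%s\n' "$out"
# prints: 101,New  York,NY
```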
# 7  
Old 07-30-2008
Hi Rajiv,

Thanks for your reply.
I am using this command: sed 's/ | /|/g' temp.dat>temp1
After reading your post I am scared to use the sed command.

Thanks,
Deep
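
Apart from speed, note that the pattern ' | ' only matches exactly one space, a pipe, and one space. A small sketch on a made-up line shows the difference from a pattern that accepts any number of spaces on either side:

```shell
# Hypothetical line with uneven padding around the pipes
line='A  |  B C  |D'

# Pattern from the post: exactly one space on each side of the pipe
narrow=$(printf '%s\n' "$line" | sed 's/ | /|/g')

# Variant: zero or more spaces on each side of the pipe
wide=$(printf '%s\n' "$line" | sed 's/ *| */|/g')

printf '%s\n' "$narrow"   # prints: A | B C  |D
printf '%s\n' "$wide"     # prints: A|B C|D
```

In both cases the single space between B and C is kept, because it is not next to a pipe.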