Search Results

Showing results 1 to 25 of 145. Search: Posts Made By: gimley

Forum: UNIX for Beginners Questions & Answers 03-03-2020

Out of memory message

3,635

Is 700 KB a mistake?
That doesn't sound like a large file to me....

Can you show the input and the required output (a small portion, of course)?

What 'DOS' are you referring to, what awk are you using...

Forum: UNIX for Beginners Questions & Answers 03-03-2020

Out of memory message

3,635

Hi,

I just ran your script on a Linux system, without any output, on a file with a third of a million words in it (file size 2700 KB; I used this file: wordlist.xz (http://www.megabert.de/wordlist.xz))....

Forum: Shell Programming and Scripting 01-28-2019

Find Syllable count mismatch

1,590

Well, I have to go now. So - one for the road:
awk -F= 'split($1,T,"|") != split($2,T,"|")' file
zu|ba|i|dA=ज़ु|बै|दा
zu|ba|i|r=ज़ु|बै|र
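
For anyone who wants to step through the same check outside awk, here is a rough Python sketch. It assumes each line has the form left=right with syllables separated by |. The function name mismatched and the third sample line are made up for illustration.

```python
def mismatched(lines, sep="|"):
    """Return the lines whose two sides have different syllable counts."""
    bad = []
    for line in lines:
        line = line.rstrip("\n")
        # Split on the first "=" only: left is the key, right is the gloss.
        left, _, right = line.partition("=")
        if len(left.split(sep)) != len(right.split(sep)):
            bad.append(line)
    return bad


# The first two lines come from the post; the third is a hypothetical
# matching pair added so that the filter has something to drop.
sample = ["zu|ba|i|dA=ज़ु|बै|दा", "zu|ba|i|r=ज़ु|बै|र", "ka|mal=क|मल"]
for line in mismatched(sample):
    print(line)
```

One difference to keep in mind: awk's split returns 0 for an empty string, while Python's str.split(sep) returns a one-element list, so lines with an empty side may be classified differently.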

Forum: Shell Programming and Scripting 10-14-2018

Remove dupes in a large file

3,430

Exactly, X[$0]++ holds a number value; i.e. each new line consumes a number's space.

Forum: Shell Programming and Scripting 10-13-2018

Remove dupes in a large file

3,430

In case there is a RAM shortage, the following variant helps (saves some bytes per line).
awk '!($0 in X) { print; X[$0] }' file > file.dedup
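
The logic of that variant (test membership only, store no counter) can be sketched in Python as follows. This only illustrates the idea; Python's memory behaviour is not comparable to awk's, and the name dedupe is an assumption.

```python
def dedupe(lines):
    """Yield each distinct line once, in order of first appearance."""
    seen = set()  # keys only, no per-line counter
    for line in lines:
        if line not in seen:
            seen.add(line)
            yield line


print(list(dedupe(["a", "b", "a", "c", "b"])))
```

Because it is a generator, it can be fed a file object directly and will write out lines as it goes instead of building the whole result first.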

Forum: Shell Programming and Scripting 10-13-2018

Remove dupes in a large file

3,430

Hi,

I presume you mean you want to dedupe the file (because that is what your script does and that is in the title), not necessarily sort it.

You can try the difference between
awk '!X[$0]++'...

Forum: Shell Programming and Scripting 06-11-2018

Creating verbal structures from a dictionary and a template

1,807

How about
awk -F= 'FNR == NR {if (NR > 1) TA[$1] = $2; next} {TMP = $0; for (t in TA) {$0 = TMP; sub ("\|", t); sub ("#", TA[t]); print}}' file1 file2?
For "go", try awk -F= 'FNR == NR {if (NR > 1)...

Forum: Shell Programming and Scripting 06-11-2018

Creating verbal structures from a dictionary and a template

1,807

You forgot one essential thing: setting the field separator to =.

Forum: Shell Programming and Scripting 05-20-2018

Help with script to convert rows to columns

999

Note that although your printf happens to work with the data you're using, it is dangerous to assume that no characters in data you're printing will ever be interpreted as format string control...
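
The same hazard exists in any printf-style formatter. The sketch below uses Python's % operator, not the awk printf from the thread, purely to show the failure mode; render_unsafe and render_safe are hypothetical names.

```python
def render_unsafe(data):
    # Hazard: the data itself is used as the format string.
    return data % ()


def render_safe(data):
    # Fixed format string; the data is only ever an argument.
    return "%s" % (data,)


print(render_safe("50% done"))
try:
    render_unsafe("50% done")
except (TypeError, ValueError) as exc:
    print("unsafe call failed:", exc)
```

In awk the corresponding habit is to write printf "%s\n", x rather than printf x.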

Forum: Shell Programming and Scripting 05-20-2018

Help with script to convert rows to columns

999

You weren't too far off. Try FS="[;=]".

Forum: Shell Programming and Scripting 05-09-2018

Modification of perl script to split a large file into chunks of 5000 characters

3,219

That may also be why your perl has issues as well. UTF-8 encodes all 1,112,064 valid Unicode code points, so a single UTF-8 character may occupy 8, 16, 24, or 32 bits (1 to 4 bytes).

To fix perl will require the...
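
To see the variable width concretely, here is a small Python sketch (not the perl fix discussed in the thread). It prints the encoded size of a few characters and cuts text by characters instead of bytes. The helper names are assumptions; the chunk size of 5000 comes from the thread title.

```python
def utf8_len(ch):
    """Number of bytes the character occupies when encoded as UTF-8."""
    return len(ch.encode("utf-8"))


def char_chunks(text, size=5000):
    """Split text into pieces of at most `size` code points each."""
    return [text[i:i + size] for i in range(0, len(text), size)]


# a (ASCII), e-acute, Devanagari ja, and an emoji: 1, 2, 3 and 4 bytes.
for ch in ("a", "\u00e9", "\u091c", "\U0001F600"):
    print(repr(ch), utf8_len(ch), "byte(s)")

print(char_chunks("abcdefg", 3))
```

Note that Python strings are indexed by code point, so a combining mark can still be separated from its base character at a chunk boundary.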

Forum: Shell Programming and Scripting 05-08-2018

Modification of perl script to split a large file into chunks of 5000 characters

3,219

As an aside, there is a split command that does exactly what you ask.

split -b [size in bytes ] infile [option control outfile naming]

Linux man page:

split(1) - Linux manual page...

Forum: Shell Programming and Scripting 04-19-2018

Help to identify blank space in a file

1,173

True. Still, as a measure of safety I would rule out trailing or leading spaces:

sed -n 's/^[[:blank:]]*//;s/[[:blank:]]*$//;/ /!p' > /result/file

I hope this helps.

bakunin
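
The intent of that filter (trim blanks at both ends, then keep only lines with no space left in them) could be sketched in Python like this. The function name is an assumption.

```python
def lines_without_inner_space(lines):
    """Trim blanks at both ends; keep lines that contain no space afterwards."""
    kept = []
    for line in lines:
        trimmed = line.strip(" \t\n")
        # Only a literal space counts here, matching the "/ /" test.
        if " " not in trimmed:
            kept.append(trimmed)
    return kept


print(lines_without_inner_space(["  foo  ", "foo bar", "\tbaz"]))
```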

Forum: Shell Programming and Scripting 04-19-2018

Help to identify blank space in a file

1,173

For instance using grep:

grep -v '[^ ] [^ ]' your_file

Forum: Shell Programming and Scripting 10-24-2017

Find and replace in a file from another file

1,110

How about
sed 's/[[:punct:]]/ &/g' file
s 'est
l 'air
d 'homme
l 'issue
bleu -blanc -rouge
(SDF )
a -t -il ?
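
A rough Python equivalent of that substitution, for comparison. It covers ASCII punctuation only (string.punctuation), whereas [[:punct:]] in sed depends on the locale; the function name is made up.

```python
import re
import string

# Character class built from the ASCII punctuation characters.
PUNCT = re.compile("[" + re.escape(string.punctuation) + "]")


def space_before_punct(line):
    """Insert one space in front of every ASCII punctuation character."""
    return PUNCT.sub(lambda m: " " + m.group(0), line)


for text in ("s'est", "bleu-blanc-rouge", "a-t-il?"):
    print(space_before_punct(text))
```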

Forum: Shell Programming and Scripting 10-24-2017

Find and replace in a file from another file

1,110

Any attempts / ideas / thoughts from your side?

Is the list given complete, or does your request apply to ALL punctuation chars?

Forum: Shell Programming and Scripting 08-27-2017

Regex to identify illegal characters in a perso-arabic database

2,780

To bring what MadeInGermany said directly into your problem statement...

If the following characters are the only legal characters on a line written in Sindhi:...

Forum: Shell Programming and Scripting 06-05-2017

Matching number of syllables on right-hand and left side

983

Run as perl separate.pl gimley.example
use strict;
use warnings;

my $clean = 'clean.gmly';
my $inconsistent = 'inconsistent.gmly';

open my $clean_fh, '>', $clean or die;
open my...

Forum: Shell Programming and Scripting 06-05-2017

Matching number of syllables on right-hand and left side

983

Try:
awk -F= 'split($1,F," ")!=split($2,F," "){print>f; next}1' f=file.bad file > file.good
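
As a sketch of what that one-liner does, the Python below sorts lines into a "good" and a "bad" list instead of two files. It assumes syllables are separated by whitespace on both sides of the =, and the function name is hypothetical.

```python
def partition_by_count(lines):
    """Return (good, bad): bad holds lines whose side counts differ."""
    good, bad = [], []
    for line in lines:
        left, _, right = line.partition("=")
        # str.split() with no argument splits on runs of whitespace,
        # like awk's split(s, F, " ").
        if len(left.split()) != len(right.split()):
            bad.append(line)
        else:
            good.append(line)
    return good, bad


good, bad = partition_by_count(["a b=x y", "a b c=x y"])
print("good:", good)
print("bad:", bad)
```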

Forum: Shell Programming and Scripting 05-01-2017

Why does this awk script not work correctly?

1,227

The for loop can be shortened, and a classic split trick clears an array.
BEGIN { FS = "[=,]"
}
{ o = $1 "=" $2
s[$2]
for(i = 3; i <= NF; i++)
if(!($i in...

Forum: Shell Programming and Scripting 05-01-2017

Why does this awk script not work correctly?

1,227

If the order of the Indic glosses is unimportant, try also
awk -F= '
{for (MX=n=split($2, T, ","); n>0; n--) C[T[n]]
printf "%s=", $1
DL = ""
for (c in C) ...

Forum: Shell Programming and Scripting 05-01-2017

Why does this awk script not work correctly?

1,227

Assuming that the order of the Indic glosses has to be kept as they appear in the input (only removing duplicated Indic glosses), assuming that you're using a version of awk that...
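
The order-preserving case can be illustrated with a short Python sketch. It assumes lines of the form word=gloss1,gloss2,... (inferred from the awk snippets in this thread); dedupe_glosses is a made-up name, and this is not the awk solution the post goes on to give.

```python
def dedupe_glosses(line):
    """Drop repeated glosses after the "=", keeping first-seen order."""
    head, sep, tail = line.partition("=")
    # dict.fromkeys keeps insertion order, so the first occurrence wins.
    unique = list(dict.fromkeys(tail.split(",")))
    return head + sep + ",".join(unique)


print(dedupe_glosses("word=a,b,a,c,b"))
```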

Forum: Shell Programming and Scripting 04-30-2017

Awkscript to reduce words delimited with comma on right hand to columns

1,379

Consolidate the fields:

awk -F '[,=]' '{ for(i=1; i<NF; i++) { printf("%s=%s\n", $(i), $(NF) )}} ' filename > newfile

This will not work with older versions of awk.
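
Here is a hedged Python sketch of the same expansion, assuming input lines such as a,b,c=x. Unlike the -F '[,=]' version it splits only the left-hand side on commas, so a comma on the right-hand side is left alone. The name expand is an assumption.

```python
def expand(line):
    """Turn "a,b,c=x" into ["a=x", "b=x", "c=x"]."""
    left, _, right = line.rstrip("\n").partition("=")
    return [f"{word}={right}" for word in left.split(",")]


for out in expand("a,b,c=x"):
    print(out)
```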

Forum: Shell Programming and Scripting 04-30-2017

Awkscript to reduce words delimited with comma on right hand to columns

1,379

Why not adapt / improve your own approach:
awk 'BEGIN{FS="="}
{n=split($1,a,",");for (i=1;i<=n;i++) print a[i]"="$2}' file

Forum: Shell Programming and Scripting 04-03-2017

Regex to hunt for a string in the right hand column

1,282

Please, try the following:
perl -ne '/^\w+=.+-/ and print'

Or test with any regex engine that supports Perl regex.

/^\w+=.+-/
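
For example, the same pattern can be tried in Python's re module, as in this minimal sketch. The function name and test strings are made up. Python's \w matches Unicode word characters in str patterns, so results on non-ASCII input may differ from Perl's depending on how the input is decoded there.

```python
import re

PATTERN = re.compile(r"^\w+=.+-")


def has_hyphen_on_right(line):
    """True if the line is word=... with a hyphen after the "="."""
    return PATTERN.search(line) is not None


for text in ("abc=de-f", "abc=def", "ab-c=def"):
    print(text, has_hyphen_on_right(text))
```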
