08-07-2017
Hmm, interesting. Isn't the record separator a newline now?
What if one sentence spans multiple lines? Won't it be counted as two or more sentences?
Also, I don't understand exactly what the sub command is doing.
Thank you.
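To make the concern concrete, here is a minimal sketch. The sample text and the variable names are made up for illustration, and using "." as the record separator is only an assumption, not necessarily what the earlier reply did. It compares awk's default newline-separated records with period-separated records on input where sentences wrap across lines:

```shell
# Hypothetical sample: two sentences wrapped across three lines.
text='This sentence spans
two lines. This one
spans two as well.'

# Default RS (newline): every line is its own record, so NR counts lines.
lines=$(printf '%s\n' "$text" | awk 'END { print NR }')

# Assumed alternative: RS = "." makes each period-terminated chunk one record.
# Only records containing a non-whitespace character are counted, which
# skips the trailing newline left after the final period.
sentences=$(printf '%s\n' "$text" |
  awk 'BEGIN { RS = "." } /[^ \t\n]/ { n++ } END { print n + 0 }')

echo "lines=$lines sentences=$sentences"
```

With newline as the separator the count follows the line breaks rather than the sentences, which is the over-counting asked about above. A lone "." is also a crude sentence boundary: abbreviations and decimal numbers would still be split.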