Not sure how to reconcile your written spec with the sample output. Do you mean you want to insert a new field, a copy of tolower($1), between $1 and $2? And should the count be the line number within each block delimited by <s> and </s>?
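Assuming the above is right, try:
Code:
awk '
{
    if ($1 ~ /<\/?s>/)              # <s> or </s>: remember this line number
        ST = NR
    else {
        $1 = $1 OFS tolower($1)     # append a lowercased copy of field 1 right after it
        $3 = NR - ST                # lines since the last marker
    }
}
1                                   # print every line
' OFS="\t" file
Output: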
<s>
He he PRP 1
could could MD 2
tell tell VB 3
she she PRP 4
was was VBD 5
teasing teasing VBG 6
him him PRP 7
. . . 8
</s>
<s>
He he PRP 1
kept kept VBD 2
his his PRP$ 3
eyes eyes NNS 4
closed closed VBD 5
, , , 6
but but CC 7
he he PRP 8
could could MD 9
feel feel VB 10
himself himself PRP 11
smiling smiling VBG 12
. . . 13
</s>
This solution works great for ASCII characters. However, I have some non-ASCII characters, and they are not converted correctly by tolower. For instance, this is what happened:
Code:
<s>
Pero pero cc 0
lo lo da0000 1
más m?s rg 2
importante importante aq0000 3
, , fc 4
no no rn 5
sólo s?lo rg 6
desde desde sp000 7
la la da0000 8
visión visi?n nc0s000 9
de de sp000 10
una una di0000 11
parte parte nc0s000 12
</s>
Is there a way to preserve the non-ASCII characters when using tolower?
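One possible way around this, as a sketch only: the "?" characters look like what you get when tolower works byte by byte in a single-byte locale and changes one byte of a UTF-8 sequence. That is a guess, since the thread does not state the file encoding or the awk in use. Assuming the file is UTF-8 and tab-separated, running awk in the C locale makes tolower touch only A-Z and pass every other byte through unchanged. The function name lower_tokens is made up for this example.

```shell
# Hypothetical helper: reads tab-separated "token<TAB>tag" lines on stdin.
# Assumption: input is UTF-8, and sentences are delimited by <s> and </s> lines.
lower_tokens() {
    LC_ALL=C awk '
    BEGIN { FS = OFS = "\t" }
    $1 ~ /^<\/?s>$/ { start = NR; print; next }   # sentence markers pass through unchanged
    { print $1, tolower($1), $2, NR - start }     # in the C locale only A-Z are lowercased
    '
}

# Usage: lower_tokens < file
```

The trade-off is that accented capitals (such as a leading "É") are left as they are. If those need lowercasing too, a multibyte-aware awk such as gawk run under a UTF-8 locale is the usual route, provided such a locale is installed on the system.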