I added a space in front of the matching regex so that it won't catch words like 'SEA|'.
Is it OK to assume that you want to cut lines at words that are always prefixed by a space and immediately followed by a pipe (|)?
(No big deal, but I also modified the sed script slightly to avoid repeating '|' in the regex.)
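The script itself is not quoted here, so what follows is only a minimal sketch of the idea, not the actual script. It assumes a single hypothetical keyword, EA, and that genuine keywords always follow a space; the function name split_before_keyword is made up for illustration.

```shell
# Sketch only: the keyword "EA" and the function name are assumptions,
# not taken from the original script.
split_before_keyword() {
    # Replace " EA|" with a newline followed by "EA|".
    # The leading space in the pattern keeps "SEA|" from matching.
    # The backslash-newline form of the replacement is portable sed.
    sed 's/ \(EA|\)/\
\1/g'
}

printf '%s\n' 'EA|first SEA|second EA|third' | split_before_keyword
```

Without the leading space in the pattern, the 'EA|' inside 'SEA|' would also be treated as a split point.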
All,
I have a requirement where I will need to split a line into multiple lines.
Ex:
Input:
2ABCDEFGH2POIYUY2ASDGGF2QWERTY
Output:
2ABCDEFGH
2POIYUY
2ASDGGF
2QWERTY
The data is not of fixed length. The only rule is that each line has to start with 2.
How can this be done? (5 Replies)
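One possible approach, as a minimal sketch: it assumes the digit 2 never occurs inside the data itself, which the question does not state, and the function name split_before_2 is made up for illustration.

```shell
# Sketch only: assumes "2" appears solely as a record marker.
split_before_2() {
    # Put a newline before every "2", then drop the newline that
    # ends up in front of the very first record.
    awk '{ gsub(/2/, "\n2"); sub(/^\n/, ""); print }'
}

printf '%s\n' '2ABCDEFGH2POIYUY2ASDGGF2QWERTY' | split_before_2
```

If a 2 can also appear inside a record, this sketch would split there too, and a stricter rule for what marks the start of a record would be needed.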
I am trying to figure out how to split a file when the data in the new line is different from the current line using a shell script?
For example, if my input file contains the following:
2341123 ABCAD
2341123 ANCAED
2341123 AVADV
3343434 ASDVAV
3343434 ASDFADF
4231232 ADACVAV
4231232... (3 Replies)
Hi,
I want to split a line before reading it completely, because the line is very big and reading it throws an out-of-memory error. Can you suggest an approach?
When I use
#cat $inputFile | while read eachLine
and then try to split eachLine, it runs out of memory, because the line is more than 10000000 characters long.
Can you... (1 Reply)
Dear All,
I want to split a single line into two or three lines wherever “|”-separated values occur, using...
Input line
test,DEMTEMPUT20100404010012,,,,,,,,|0070086|0070087,
Output should be
test,DEMTEMPUT20100404010012,,,,,,,,0070086,
test,DEMTEMPUT20100404010012,,,,,,,,0070087, (14 Replies)
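A minimal sketch of how the sample above could be expanded with awk. It assumes the layout seen in the sample: a common prefix before the first pipe, and everything from the first comma after the last value repeated at the end of each output line. The function name expand_pipe_values is made up for illustration.

```shell
# Sketch only: layout assumptions are taken from the single sample line.
expand_pipe_values() {
    awk -F'|' '
    NF < 2 { print; next }            # no pipe: pass the line through unchanged
    {
        prefix = $1
        last = $NF
        suffix = ""
        pos = index(last, ",")
        if (pos > 0) {                # text from the first comma on trails every output line
            suffix = substr(last, pos)
            last = substr(last, 1, pos - 1)
        }
        for (i = 2; i < NF; i++) print prefix $i suffix
        print prefix last suffix
    }'
}

printf '%s\n' 'test,DEMTEMPUT20100404010012,,,,,,,,|0070086|0070087,' | expand_pipe_values
```

Lines with more than one pipe-separated group would need a different rule, which the sample does not cover.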
Hi all,
I have logs (in a log file) with the following structure:
20100916011501559;0.812;null;TRUE;;FALSE;0.812;0;0;;19
20100916011504762;0.015;null;TRUE;;FALSE;0;4|4;0.015;;4
20100916011504762;0;null;TRUE;;FALSE;0;0;0;;4
20100916011501731;3.343;null;TRUE;;FALSE;3.156;131|65;0.172;;11... (14 Replies)
The requirement: there is a log file that contains a huge amount of data. I need to extract a particular field from it by searching on another field.
Ex:
2011-03-28 13:00:07,423 : millis=231 q={ call get_data_account(?,?,?,?,?) }, params=
I need to search for the word "get_data_account" in the file... (1 Reply)
Hi guys,
I need to split a file into a number of files. The file contains FILEHEADER and EOF markers, and I have to split it n times, forming one file from each message between FILEHEADER and EOF, using awk BEGIN and END. How can I implement this? Please suggest. (2 Replies)
Hi All,
I have a file (File1) with data like the following:
102100|LName|Gender|Company|Branch|Bday|Salary|Age
102100|bbbb|male|cccc|dddd|19900814|15000|20|
102101|asdg|male|gggg|ksgu|19911216|||
102102|bdbm|male|kkkk|acke|19931018||23|
102102|kfjg|male|kkkc|gkgg|19921213|14000|24|... (2 Replies)
Hi Gurus,
I have the JSON file below, and I want to rewrite it into a new file.
I would appreciate it if anyone could help me with a solution. I can't use jq.
{
"_id": "3ad893cb4cf1560add7b4caffd4b6126",
"_rev": "1-1f0ce165e1d210319cf6e9f9c6ff654f",
"name":... (4 Replies)
Discussion started by: manas_ranjan
MARC::Charset(3pm) User Contributed Perl Documentation MARC::Charset(3pm)
NAME
MARC::Charset - convert MARC-8 encoded strings to UTF-8
SYNOPSIS
# import the marc8_to_utf8 function
use MARC::Charset 'marc8_to_utf8';
# prepare STDOUT for utf8
binmode(STDOUT, ':utf8');
# print out some marc8 as utf8
print marc8_to_utf8($marc8_string);
DESCRIPTION
MARC::Charset allows you to turn MARC-8 encoded strings into UTF-8 strings. MARC-8 is a single byte character encoding that predates
Unicode, and allows you to put non-Roman scripts in MARC bibliographic records.
http://www.loc.gov/marc/specifications/spechome.html
EXPORTS
ignore_errors()
Tells MARC::Charset whether or not to ignore all encoding errors, and returns the current setting. This is helpful if you have records
that contain both MARC8 and UNICODE characters.
my $ignore = MARC::Charset->ignore_errors();
MARC::Charset->ignore_errors(1); # ignore errors
MARC::Charset->ignore_errors(0); # DO NOT ignore errors
assume_unicode()
Tells MARC::Charset whether or not to assume UNICODE when an error is encountered in ignore_errors mode and returns the current setting.
This is helpful if you have records that contain both MARC8 and UNICODE characters.
my $setting = MARC::Charset->assume_unicode();
MARC::Charset->assume_unicode(1); # assume characters are unicode (utf-8)
MARC::Charset->assume_unicode(0); # DO NOT assume characters are unicode
assume_encoding()
Tells MARC::Charset whether or not to assume a specific encoding when an error is encountered in ignore_errors mode and returns the current
setting. This is helpful if you have records that contain both MARC8 and other characters.
my $setting = MARC::Charset->assume_encoding();
MARC::Charset->assume_encoding('cp850'); # assume characters are cp850
MARC::Charset->assume_encoding(''); # DO NOT assume any encoding
marc8_to_utf8()
Converts a MARC-8 encoded string to UTF-8.
my $utf8 = marc8_to_utf8($marc8);
If you'd like to ignore errors pass in a true value as the 2nd parameter or call MARC::Charset->ignore_errors() with a true value:
my $utf8 = marc8_to_utf8($marc8, 'ignore-errors');
or
MARC::Charset->ignore_errors(1);
my $utf8 = marc8_to_utf8($marc8);
utf8_to_marc8()
Will attempt to translate utf8 into marc8.
my $marc8 = utf8_to_marc8($utf8);
If you'd like to ignore errors, or characters that can't be converted to marc8 then pass in a true value as the second parameter:
my $marc8 = utf8_to_marc8($utf8, 'ignore-errors');
or
MARC::Charset->ignore_errors(1);
my $marc8 = utf8_to_marc8($utf8);
DEFAULT CHARACTER SETS
If you need to alter the default character sets you can set the $MARC::Charset::DEFAULT_G0 and $MARC::Charset::DEFAULT_G1 variables to the
appropriate character set code:
use MARC::Charset::Constants qw(:all);
$MARC::Charset::DEFAULT_G0 = BASIC_ARABIC;
$MARC::Charset::DEFAULT_G1 = EXTENDED_ARABIC;
SEE ALSO
o MARC::Charset::Constants
o MARC::Charset::Table
o MARC::Charset::Code
o MARC::Charset::Compiler
o MARC::Record
o MARC::XML
AUTHOR
Ed Summers (ehs@pobox.com)
perl v5.12.4 2011-08-05 MARC::Charset(3pm)