Hi All,
I need to select only those records having a non zero record in the first column of a comma delimited file.
Suppose my input file is having data like:
"0","01/08/2005 07:11:15",1,1,"Created",,"01/08/2005"
"0","01/08/2005 07:12:40",1,1,"Created",,"01/08/2005"... (2 Replies)
I'll try explain this as best I can. Let me know if it is not clear.
I have large text files that contain data as such:
143593502 09-08-20 09:02:13 xxxxxxxxxxx xxxxxxxxxxx 09-08-20 09:02:11 N line 1 test
line 2 test
line 3 test
143593503 09-08-20 09:02:13... (3 Replies)
Hello People,
I have the following contents in an XML file
...........
...........
..........
...........
<Details = "Sample Details">
<Name>Bob</Name>
<Age>34</Age>
<Address>CA</Address>
<ContactNumber>1234</ContactNumber>
</Details>
...........
.............
.............. (4 Replies)
I have a huge file (about 2 millions records) contains data separated by “,” (comma). As part of the requirement, I can't change the format. The objective is to remove some of the records with the following condition. If the 23rd field on each line start with 302 , I need to remove that from the... (4 Replies)
Hello Friends,
I have a file(InputFile.csv) with the following columns(the columns are pipe-delimited):
ColA|ColB|ColC|ColD|ColE|ColF
Now for this file, I have to get those records which fulfil the following condition:
If "ColB" is NOT NULL and "ColD" has values one of the following... (9 Replies)
Hi,
I have a shell script which extracts records form oracle to unix file.
sqlplus -s ${WMD_DM_CONNECT} <<EOF >$tmpfile
set heading off
set pagesize 0
set feedback off
select CD_DESC||'|'||CD_ID||'|'||'Arun'||'|'||'Montu' from WMD_SYS_CD_LKUP
where CD_TYP =... (5 Replies)
I have a folder containing text files. I need to extract specific lines from the files of this folder based on another file input.txt. How can I do this with awk/sed?
file1
ARG 81.9 8 81.9 0
LEU 27.1 9 27.1 0
PHE .0 10 .0 0
ASP 59.8 11 59.8 0
ASN 27.6 12 27.6 0
ALA .0 13 .0 0... (5 Replies)
In the file below I am trying to extract a specific instance of path, if the adjacent plugin": "/rundb/api/v1/plugin/49/. Thank you :).
file
"path": "/results/analysis/output/Home/Auto_user_S5-00580-4-Medexome_65_028/plugin_out/FileExporter_out.52", "plugin": "/rundb/api/v1/plugin/49/",... (8 Replies)
I'm new to shell programming, I have a huge text file in the following format, where columns are separated by single space:
ACA MEX 4O_ $98.00 $127.40 $166.60 0:00 0:00 0 ;
ACA YUL TS_ $300.00 $390.00 $510.00 0:00 0:00 0 ;
ACA YYZ TS_ $300.00 $390.00 $510.00 0:00 0:00 0 ;
ADZ YUL TS_ $300.00... (3 Replies)
I have two files and would need to filter out records based on certain criteria, these column are of variable lengths, but the lengths are uniform throughout all the records of the file. I have shown a sample of three records below. Line 1-9 is the item number "0227546_1" in the case of the first... (15 Replies)
Discussion started by: MIA651
15 Replies
LEARN ABOUT DEBIAN
mmseg
MMSEG(1) User Contributed Perl Documentation MMSEG(1)NAME
mmseg - maximum matching segment Chinese text.
SYNOPSIS
mmseg -d dict_file [option]... [corpus_file]...
DESCRIPTION
mmseg is a tool for segmenting Chinese text into words using maximum matching algorithm. mmseg segments corpus_file, or standard input if
no filename is specified, and write the segmented result to standard output.
OPTIONS -d dict_file
Use dict_file as lexicon. A default lexicon can be found at /usr/share/sunpinyin-slm/dict.utf8.
-f,--format (text|bin)
Output Format, can be 'text' or 'bin'. default 'bin'. Normally, in text mode, word text are output, while in binary mode, binary short
integer of the word-ids are written to stdout.
-s, --stok STOK_ID
Sentence token id. Default 10. It will be written to output in binary mode after every sentence.
-i, --show-id
Show Id info. Under text output format mode, attach id after known words. If under binary mode, print id(s) in text.
-a, --ambiguious-id AMBI-ID
Ambiguious means ABC => A BC or AB C. If specified (AMBI-ID != 0), The sequence ABC will not be segmented, in binary mode, the AMBI-ID
is written out; in text mode, "<ambi>ABC</ambi>" will be output. Default is 0.
NOTES
Under binary mode, consecutive id of 0 are merged into one 0. Under text mode, no space are inserted between unknown-words.
AUTHOR
Originally written by Phill.Zhang <phill.zhang@sun.com>. Currently maintained by Kov.Chai <tchaikov@gmail.com>.
SEE ALSO slmseg(1), ids2ngram (1).
perl v5.14.2 2012-06-09 MMSEG(1)