Post 302228060 by Linux Bot on Friday 22nd of August 2008 05:20:06 PM
Extremely Fast Text Feature Extraction for Classification and Indexing

HPL-2008-91R1 Extremely Fast Text Feature Extraction for Classification and Indexing - Forman, George; Kirshenbaum, Evan
Keyword(s): text mining, text indexing, bag-of-words, feature engineering, feature extraction, document categorization, text tokenization
Abstract: Most research in speeding up text mining involves algorithmic improvements to induction algorithms, and yet for many large scale applications, such as classifying or indexing large document repositories, the time spent extracting word features from texts can itself greatly exceed the initial trainin ...
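For readers unfamiliar with the "extracting word features" step the abstract refers to, here is a minimal bag-of-words sketch in shell. It is not the method from the report, whose details are not shown in this post; the function name `word_counts` and the tokenization rule (runs of letters, lowercased) are assumptions made only for illustration.

```shell
# Hypothetical helper (not from the report): read text on stdin and print
# one "word count" pair per line, sorted by word.
word_counts() {
  tr -cs '[:alpha:]' '\n' |        # turn every run of non-letters into one newline
    tr '[:upper:]' '[:lower:]' |   # fold case so "The" and "the" are one feature
    awk 'NF { n[$1]++ } END { for (w in n) print w, n[w] }' |
    sort
}

# Example: count the word features of a single short document.
printf 'The cat saw the hat.\n' | word_counts
```

Even this naive pipeline makes several passes over the text, which gives a feel for why the extraction step can dominate the cost on large repositories.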

xscreensaver-text(1)                                            XScreenSaver manual                                           xscreensaver-text(1)

NAME
       xscreensaver-text - prints some text to stdout, for use by screen savers.

SYNOPSIS
       xscreensaver-text [--verbose] [--columns N] [--text STRING] [--file PATH] [--program CMD] [--url URL]

DESCRIPTION
       The xscreensaver-text script prints out some text for use by various screensavers, according to the options set in the ~/.xscreensaver file. This may dump the contents of a file, run a program, or load a URL.

OPTIONS
       xscreensaver-text accepts the following options:

       --columns N or --cols N
               Where to wrap lines; default 72 columns.

       --verbose or -v
               Print diagnostics to stderr. Multiple -v switches increase the amount of output.

       Command line options may be used to override the settings in the ~/.xscreensaver file:

       --string STRING
               Print the given string. It may contain % escape sequences as per strftime(2).

       --file PATH
               Print the contents of the given file. If --cols is specified, re-wrap the lines; otherwise, print them as-is.

       --program CMD
               Run the given program and print its output. If --cols is specified, re-wrap the output.

       --url HTTP-URL
               Download and print the contents of the HTTP document. If it contains HTML, RSS, or Atom, it will be converted to plain-text.

               Note: this re-downloads the document every time it is run! It might be considered abusive for you to point this at a web server that you do not control!

ENVIRONMENT
       HTTP_PROXY or http_proxy
               to get the default HTTP proxy host and port.

BUGS
       The RSS and Atom output is always ISO-8859-1, regardless of locale.

       URLs should be cached, use "If-Modified-Since", and obey "Expires".

SEE ALSO
       xscreensaver-demo(1), xscreensaver(1), fortune(1), phosphor(1), apple2(1), starwars(1), fontglide(1), dadadodo(1), webcollage(1), http://www.livejournal.com/stats/latest-rss.bml, http://twitter.com/statuses/public_timeline.atom, driftnet(1), EtherPEG, EtherPeek

COPYRIGHT
       Copyright (C) 2005 by Jamie Zawinski. Permission to use, copy, modify, distribute, and sell this software and its documentation for any purpose is hereby granted without fee, provided that the above copyright notice appear in all copies and that both that copyright notice and this permission notice appear in supporting documentation. No representations are made about the suitability of this software for any purpose. It is provided "as is" without express or implied warranty.

AUTHOR
       Jamie Zawinski <jwz@jwz.org>, 20-Mar-2005.

X Version 11                                                    5.15 (28-Sep-2011)                                            xscreensaver-text(1)
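
A short sketch of how the --file and --cols options described in this manual page might be combined in a script. The wrapper name `wrapped_text` is hypothetical, and the fold(1) fallback is an assumption added so the sketch still prints something on systems without XScreenSaver; fold only breaks long lines, so it is a rough stand-in and not a description of how xscreensaver-text re-wraps text.

```shell
# Hypothetical wrapper: print a file wrapped to a given width.
#   $1 = path to the file, $2 = column width (defaults to 72, the documented default)
wrapped_text() {
  if command -v xscreensaver-text >/dev/null 2>&1; then
    # Options as documented in the man page: --cols N and --file PATH.
    xscreensaver-text --cols "${2:-72}" --file "$1"
  else
    # Assumed fallback, not part of xscreensaver: break long lines at spaces.
    fold -s -w "${2:-72}" "$1"
  fi
}
```

For example, `wrapped_text notes.txt 60` would print a file named notes.txt (a placeholder name) wrapped at 60 columns.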