I don't know if this is the right place to post my question, but I'm looking for ideas on how to do this.
I have text files extracted by OCR, and I need to pull snippets from them to import into a database table named snippet, which has the columns "snippet, date, title". I don't know whether a shell script can do this with the power of grep and regular expressions on Linux, or whether there are any open-source or commercial tools for this task.
A sample is on a Google Drive public share: 1181.txt
snippet: one paragraph of at least 400 to 500 characters ending with a period, or the paragraph cut at a maximum of 500 characters with three dots.
title: at least 50 to 100 characters ending with a period, or the paragraph cut at a maximum of 100 characters with three dots.
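To make the length rule above concrete, here is a minimal Python sketch of one way to read it. The function name `cut_text` and its parameters are my own invention, not part of any script posted in this thread, and it assumes the paragraph has already been pulled out of the OCR text.

```python
def cut_text(text, min_len, max_len):
    """Cut text at a sentence end between min_len and max_len,
    or hard-cut it at max_len with three dots (hypothetical helper)."""
    # Collapse OCR line breaks and runs of spaces into single spaces.
    text = " ".join(text.split())
    if len(text) <= max_len:
        return text  # already short enough, nothing to cut
    # First period that leaves a result of min_len..max_len characters.
    end = text.find(".", min_len - 1, max_len)
    if end != -1:
        return text[:end + 1]
    # No sentence end in the window: cut and leave room for the dots.
    return text[:max_len - 3].rstrip() + "..."
```

With this reading, a snippet would be `cut_text(paragraph, 400, 500)` and a title `cut_text(paragraph, 50, 100)`.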
Thanks for the response
Moderator's Comments:
Please use CODE tags (as required by forum rules) when posting sample input, sample output, and code segments.
Last edited by Don Cragun; 10-09-2016 at 03:58 PM..
Reason: Add CODE and ICODE tags.
Thanks for your time and effort in creating and sharing the script. I see your script focuses on the 1181.txt file. Can this script run in bulk? We are talking about 2000 to 4000 text files, with the results accumulated in one CSV file that includes the name of each text file.
Hi,
See my attached file: this script will process all *.txt files in the current directory.
The result file contains one line for each file, with the original file name as the first field.
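For readers without the attachment, the bulk run described above (every *.txt file in a directory, one output line per file, file name first) might look roughly like this in Python. This is only a sketch and not the attached script: `build_csv` and `shorten` are hypothetical names, paragraphs are assumed to be separated by blank lines, and the date column is left out here.

```python
import csv
import re
from pathlib import Path

def shorten(text, max_len):
    """Collapse whitespace and cut at max_len with three dots (hypothetical helper)."""
    text = " ".join(text.split())
    return text if len(text) <= max_len else text[:max_len - 3].rstrip() + "..."

def build_csv(src_dir, out_path):
    """Write one CSV row per *.txt file in src_dir; return the number of files."""
    count = 0
    with open(out_path, "w", newline="", encoding="utf-8") as fh:
        writer = csv.writer(fh)
        writer.writerow(["filename", "title", "snippet"])
        for path in sorted(Path(src_dir).glob("*.txt")):
            # errors="replace" keeps going if the OCR output has bad bytes.
            text = path.read_text(encoding="utf-8", errors="replace")
            # Assumption: paragraphs are separated by blank lines.
            paragraphs = [p for p in re.split(r"\n\s*\n", text) if p.strip()]
            first = paragraphs[0] if paragraphs else ""
            writer.writerow([path.name, shorten(first, 100), shorten(first, 500)])
            count += 1
    return count
```

Calling `build_csv(".", "result.csv")` would then cover the current directory. The csv module takes care of quoting commas and quotes inside the OCR text, which is easy to get wrong with plain string joins.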
---------- Post updated 10-11-16 at 10:46 AM ---------- Previous update was 10-10-16 at 05:53 PM ----------
Hi,
Here's the result after executing the script.
1. field ITEM - the result looks good.
2. field DATE - I tested 100 text files and only 2 out of 100 came out right, maybe because not all content has "Month Year" in the first row.
I tried to replace from
to match the date pattern, but had no luck making it work.
3. field TITLE - the title is sometimes 10 characters or shorter. Maybe we could add a condition, for example: the first match for the title must be at least 30 but no more than 100 characters; skip anything that does not match and stop at the first match.
4. field SNIPPET - the snippet is sometimes 70 characters or shorter. Maybe we could add a condition, for example: the first match for the snippet must be at least 400 but no more than 500 characters; skip anything that does not match and stop at the first match.
Also, could you explain the flow of the script in more detail, not only for me but for the benefit of other users?
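The conditions asked for in items 2 to 4 could be sketched in Python along these lines. This is not the script from this thread. It assumes the date appears as a full English month name followed by a four-digit year somewhere in the text (not only in the first row), and `find_date` and `first_in_range` are hypothetical names.

```python
import re

MONTHS = ("January|February|March|April|May|June|July|August|"
          "September|October|November|December")
# Assumption: dates look like "March 1987"; other layouts need another pattern.
DATE_RE = re.compile(r"\b(" + MONTHS + r")\s+(\d{4})\b", re.IGNORECASE)

def find_date(text):
    """Return the first 'Month Year' found anywhere in the text, or ''."""
    m = DATE_RE.search(text)
    return f"{m.group(1).title()} {m.group(2)}" if m else ""

def first_in_range(paragraphs, min_len, max_len):
    """Return the first paragraph of at least min_len characters,
    cut to max_len with three dots if needed; '' when none qualifies."""
    for para in paragraphs:
        para = " ".join(para.split())
        if len(para) < min_len:
            continue  # too short (e.g. a stray OCR fragment): try the next one
        if len(para) <= max_len:
            return para
        return para[:max_len - 3].rstrip() + "..."
    return ""
```

A title would then be `first_in_range(paragraphs, 30, 100)` and a snippet `first_in_range(paragraphs, 400, 500)`. One caveat with this date pattern: the word "may" followed by a four-digit number would also match, so results are worth spot-checking.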
Hi all,
I am trying to import a database in putty with the syntax:
mysql –u database_username –p database_name < filename.mysql
As you can see in the screenshot it asks me for the database password - which suggests that the syntax is correct - but then after I enter the password it gives... (2 Replies)
Hi all,
Help needed urgently.
I am currently writing a shellscript to read data/record from a flat file (.txt) file, and import/upload the data to oracle database. The script is working fine, but it takes too long time (for 18000 records, it takes around 90 mins).
I guess it takes so long... (1 Reply)
Hi Guys,
I have EMC storage, and from this storage I have mapped lun5 to a Sun Solaris server and created a mount point on this LUN named /application
I have another Sun Solaris server, and I'll clone lun5 to lun10 at the storage level so the data of lun5 will be in lun10
how to... (6 Replies)
I have been trying to write a simple snip of bash shell code to import from 1 to 100 records into a BASH array.
I have a CSV file that is structured like:
record1,item1,item2,item3,item4,etc.,etc. .... (<= 100 items)
record2,item1,item2,item3,item4,etc.,etc. .... (<= 100 items)... (5 Replies)
Hi,
I have a requirement for creating a Perl Script which will perform Data Import process in an automated way and I am elaborating herewith :
Section 1 )
- use the following command line format :
"./import.pl -h hostname -p port -f datafile.txt"
Section 2)
datafile.txt will... (3 Replies)
Hi,
I have the file (F1.XL) on a Unix box. It is updated every hour.
I would like to import f1.xl into a Windows Excel sheet when I need to see the reports.
Can anyone clarify whether there is any VB script for importing data from UNIX, like an SQL connection....
thanks (1 Reply)
Hi,
Is there any way to import data files from Unix system to Windows system?
I have many data files on Unix machine generated every night.
I need to pick certain data from each file and plug them into this windows file on the network share drive.
Anyone has any idea?
Thanks in advance! (8 Replies)
Can anyone give me an implementation of virtual memory (simulation using paging only)? It should have the following algorithms for page replacement:
1. LRU
2.FIFO
3.Clock
References to web sites would be great too.
It should include the code/algorithm, not executables (in C only) (0 Replies)
HI
I need to import data from a file which is in compressed format,
but the system doesn't have enough space to uncompress the file.
Is there any way I can import directly from the compressed file? (4 Replies)