For several days, my development team has been trying to figure out how to import a Unix data dump into SQL Server, or to convert it into an intermediate file format.
The data dump in question looks like this:
$RecordID: 1<eof>
$Version: 1<eof>
Category: 1<eof>
Poster: John Doe<eof>
ProductName: Test Product<eof>
SKU: 10045689<eof>
Line1: Test Product Line 1 Description<eof>
Line2: Test Product Line 2 Description<eof>
Comments: Test Product Comments<eof>
<eor>
There are nearly 100,000 records, each drawing on nearly 4,000 possible fields that vary based on the product's category. The field order varies per record within each category. When data does not exist for a given field, the field/value pair is simply excluded from the dump.
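For reference, here is a minimal sketch of how one might read this format one record at a time, so that only a single record is ever held in memory. This is just an illustration based on the sample above; the function name `parse_records` and the assumption that every field line ends in `<eof>` and every record in `<eor>` are mine.

```python
def parse_records(lines):
    """Yield one record at a time as a dict, keeping memory bounded.

    Assumes each field line looks like "Name: Value<eof>" and each
    record is terminated by a line containing only "<eor>".
    """
    record = {}
    for line in lines:
        line = line.rstrip("\n")
        if line == "<eor>":            # end of record: hand it back
            if record:
                yield record
            record = {}
        elif line.endswith("<eof>"):   # end of field: split name/value
            name, _, value = line[: -len("<eof>")].partition(": ")
            record[name] = value
    if record:                         # tolerate a missing final <eor>
        yield record

# Tiny usage example against the sample record shown above:
sample = [
    "$RecordID: 1<eof>",
    "Category: 1<eof>",
    "ProductName: Test Product<eof>",
    "<eor>",
]
recs = list(parse_records(sample))
```

Because `parse_records` is a generator, it could be pointed at the open dump file directly and fed into a bulk loader record by record without building the whole dataset first.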
We were going to write a parsing application that would convert this dump to XML, read it into a dataset, and then upload the dataset. About halfway through that development, however, I realized that the parsing program would require a minimum of eight gigabytes of memory to run. That obviously won't work.
(100,000 records × 4,000 fields = 400,000,000 fields) × 20 bytes per field name = 8,000,000,000, or 8 billion bytes (roughly 8 GB)
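The estimate can be checked directly; note that the 20 bytes is our assumed average field-name length, and the figure counts every possible field for every record, even though sparse records omit missing fields:

```python
records = 100_000
fields_per_record = 4_000        # possible fields per record
bytes_per_field_name = 20        # assumed average field-name length

total_fields = records * fields_per_record           # 400,000,000
total_bytes = total_fields * bytes_per_field_name    # 8,000,000,000
```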
Do you know anyone who could tell us an easier way to import this Unix dump into SQL? I'm sure there is a standard way of dealing with these dumps, but no one on our team has any experience with Unix.
Any help or referral would be GREATLY APPRECIATED. This problem is holding up our entire development process.
Sincerely,
Dalton D. Franklin, MCP
Chief Executive Officer
Simplicity Technology
http://www.simplicitycorp.net
daltonf@simplicitycorp.net
615-327-9797 Telephone
615-985-0060 Fax