I have 100 huge text files (~120 MB each), which adds up to ~11 GB of data. The files contain nothing but digits: the value of "phi" to 10 billion digits!
I know it's huge! Here are the last few lines of a file:
Each line consists of 10 groups of 10 digits, followed by the line number. What I want to do is remove the spaces, the trailing line number, and the line break. I tried doing that with sed, but I keep messing it up. I want the output as:
I'm relatively new to the shell, so a little explanation would help me learn too.
Thanks a lot.
---------- Post updated at 08:28 AM ---------- Previous update was at 07:49 AM ----------
OK, after a lot of searching I finally got it:
where the filenames are phi-001.txt, phi-002.txt ... phi-100.txt.
Is there a simpler way to do it?
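One possibly simpler route is a single sed and tr pipeline. This is only a sketch: it assumes each line looks like ten space-separated groups of ten digits followed by " : <line number>", and the function name clean_digits and the output name phi-all.txt are made up for illustration.

```shell
# Strip the " : <line number>" suffix, the spaces, and the line breaks.
# Assumed line format: ten space-separated groups of ten digits, then " : NNN".
clean_digits() {
    # sed deletes everything from the first colon (plus any spaces before it)
    # to the end of the line; tr then deletes the remaining spaces and newlines.
    sed 's/ *:.*$//' "$@" | tr -d ' \n'
}

# Example (hypothetical output name); the glob expands in sorted order,
# so phi-001.txt ... phi-100.txt are joined in sequence:
# clean_digits phi-*.txt > phi-all.txt
```

Because the names are zero-padded to three digits, the shell glob already lists them in numeric order, so no explicit loop over 001 to 100 is needed.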
---------- Post updated at 08:29 AM ---------- Previous update was at 08:28 AM ----------
I tried that, but it's not removing the newlines or the " : xxxxx" data...
Anyway, I replaced the spaces with nothing and imported the files into a MySQL database.
Now the problem is that querying the database takes a very long time.
So what's the best way to search for a substring in approximately 18 GB of data, and is it possible to create an 18 GB text file?
Yeah, I caught my error. Since I wanted 3-digit numbers with leading zeros, I had messed it up.
It's working fine now. Now my question is: "Is it possible to create an 18 GB file?" I'm using x86_64 GNU/Linux (Ubuntu 10.10). And what would be the best way to search for a substring in this file?
Yes, it's possible.
Check the block size of your disks and compare it with the table.
With a 4k block size, for example, you will be able to create a file of up to ~2 TB.
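To look up the block size, something along these lines should work on a GNU/Linux box. It is a sketch that assumes GNU coreutils stat (as shipped with Ubuntu); the function name fs_block_size is only illustrative.

```shell
# Print the block size, in bytes, of the filesystem holding a directory.
# Assumes GNU coreutils stat: -f queries the filesystem rather than the file,
# and %S is the fundamental block size.
fs_block_size() {
    stat -f -c %S "${1:-.}"
}

# Example: check the filesystem where the big file will be written.
# fs_block_size /home
```

With no argument the function reports on the current directory.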
Regarding substrings, you can use awk's substr function to print substrings.
If you can tell us what you are trying to accomplish, folks here will probably suggest the best way.
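As a rough illustration of the awk suggestion, here are two small helpers. The names line_substr and find_in_lines are made up for this sketch; substr() and index() are standard awk functions and both count positions from 1.

```shell
# Print LEN characters starting at 1-based position START of every input line.
line_substr() {
    start=$1; len=$2; shift 2
    awk -v start="$start" -v len="$len" '{ print substr($0, start, len) }' "$@"
}

# For every line containing PATTERN, print "line number:position of first match".
find_in_lines() {
    pat=$1; shift
    awk -v pat="$pat" '{ p = index($0, pat); if (p) print NR ":" p }' "$@"
}

# Examples:
# line_substr 3 4 phi-001.txt
# find_in_lines 1618 phi-001.txt
```

Both work line by line, so a pattern that straddles a line break in the input will not be found.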
What I'm trying to do is search the 10 billion digits of phi, a.k.a. the golden ratio, for number patterns, as it is said that phi will contain any number series provided you look long enough.
Once an efficient search function is done, one that also checks for repeated occurrences, it will be used to derive statistics about the numbers present, and more beyond that which I haven't thought of yet. Maybe linking it with available stats such as probability and its relations, etc.
One method I'm considering is a multi-threaded application, to speed up the process and use less RAM.
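Before writing a threaded program, a plain grep pass may be worth trying. The sketch below assumes GNU grep and a file of digits with the spaces and line breaks already removed; find_offsets and count_matches are illustrative names. Two caveats: grep reports non-overlapping matches only (searching "11" in "111" yields one hit), and it reads whole lines, so a single multi-gigabyte line may need a lot of memory.

```shell
# List the 0-based byte offset of every non-overlapping occurrence of PATTERN.
# Assumes GNU grep: -F treats the pattern as a fixed string, -o prints only
# the matched part, and -b prefixes each match with its byte offset.
find_offsets() {
    grep -F -o -b -- "$1" "$2" | cut -d: -f1
}

# Count the occurrences reported by find_offsets.
count_matches() {
    find_offsets "$1" "$2" | wc -l
}

# Examples (hypothetical file name):
# find_offsets 161803 phi-all.txt
# count_matches 161803 phi-all.txt
```

If memory turns out to be a problem, splitting the digits into chunks that overlap by the pattern length minus one would keep matches across chunk boundaries findable, at the cost of extra bookkeeping for the offsets.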
Hi Folks,
I have a text file with lots of rows with duplicates in the first column. I want to filter out records based on filter columns in a different filter text file.
Bash scripting is what I need.
Data.txt
Name OrderID Quantity
Sam 123 300
Jay 342 498
Kev 78 2500
Sam 420 50
Vic 10... (3 Replies)
How do I output only the first 400 bytes of a huge text file to a new file?
It has to be unmodified so no added invisible characters.
Many thanks..... (3 Replies)
Say I had a text file that contained four columns, like the following:
Mack Christopher:237 Avondale Blvd:970-791-6419:S
Ben Macdonor:30 Dragon Rd:647-288-6395:B
I'm making a loop that will replace the fourth column of a line in the file with the contents of a variable 'access', but I have no... (6 Replies)
I'm trying to implement the simple functionality of replacing the second line of files with some other string.
The problem is that these files are huge and there are too many files to process.
Could anyone please suggest a way to replace the second line of all files with another text in the fastest possible manner?
... (2 Replies)
Hi everyone,
I'm having trouble figuring this one out. I have ~100 *.fa files with multiple lines of fasta sequences like this: file1.fa
>xyzsequence
atcatgcacac......
ataccgagagg.....
atataccagag.....
>abcsequence
atgagatatat.....
acacacggd.....
atcgaacac....
agttccagat....
The... (2 Replies)
I'm trying to change the ramfs size in kernel .config automatically.
I have a ramfs_size file generated with du -s
cat ramfs_size
64512
I want to replace the linux .config's ramdisk size with the above value
CONFIG_BLK_DEV_RAM_SIZE=73728
Right now I'm doing something dumb like: ... (3 Replies)
Hi all,
Very first post on this forum, hope you can help me with this scripting task.
I have a big text file with over 3000 lines. Some of those lines contain some text that I need to replace; let's say for simplicity the text to be replaced in those lines is "aaa" and I need it to replace it... (2 Replies)
Hi Guys,
I need some help writing a shell script to replace the following in a text file
/opt/was/apps/was61
with some other path, e.g.
/usr/blan/blah/blah.
I know that I can do it using sed or perl, but I'm just having difficulty writing the escape characters for it
All Help... (3 Replies)
I need help!
I have one big text file, with an estimated size of 50 - 100 GB and 70 million rows,
on Sun Solaris version 8.
How can I remove the first line of the text file?
Please suggest a solution.
Thank you very much in advance:) (5 Replies)