SAS or SSD for Ubuntu 14.04 and data analysis

# 1  
Old 02-11-2016

I am in the process of building a workstation and have a question related to performance. I am a scientist who deals with big data (average file size 30-50 GB). My OS is Ubuntu 14.04, and so far I have a dual Xeon E5-2630 (6 cores each) with 128 GB of RAM. I/O buffering is an issue, so I am adding a 256/512? PCIe card and either 2 SSD or 2 SAS drives for the OS and software. Since the PCIe card will be separate and its main purpose will be file transfer, would SAS or SSD be a better fit for the OS? I am leaning towards SAS because of the buffering issue, but wanted to ask more knowledgeable users. I forgot to mention that there will also be a separate 1 or 2 TB drive. Any recommendations for the size of the SAS or SSD drives? Thanks.
# 3  
Old 02-12-2016
I access files sequentially. Thank you.
# 4  
Old 02-12-2016
SSDs are an order of magnitude or more faster than high-rpm SAS disks.
SSD capacity is limited, usually to 1-2 TB per drive. With 128 GB of memory, you could easily load whatever file you want from SSD into memory; the usual term for this is a RAM disk. Ubuntu supports this. It also caches files very effectively, without much human intervention other than configuration.
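
As a rough illustration of the RAM disk idea, here is a minimal Python sketch that copies a data file into a RAM-backed directory before analysis. It assumes `/dev/shm` is a tmpfs mount (check this on your own system), and the function name `stage_in_ramdisk` is made up for this example. By default a tmpfs is capped at half of physical RAM, so the sketch checks free space first.

```python
import os
import shutil


def stage_in_ramdisk(src_path, ramdisk_dir="/dev/shm"):
    """Copy src_path into a RAM-backed directory and return the new path.

    ramdisk_dir is ASSUMED to be a tmpfs mount; verify with `df -h`
    before relying on it.
    """
    size = os.path.getsize(src_path)
    free = shutil.disk_usage(ramdisk_dir).free
    if size > free:
        raise OSError(
            f"{src_path} needs {size} bytes, only {free} free in {ramdisk_dir}"
        )
    dest_path = os.path.join(ramdisk_dir, os.path.basename(src_path))
    # Plain byte-for-byte copy; the copy then lives in memory, not on disk.
    shutil.copyfile(src_path, dest_path)
    return dest_path
```

Remember to delete the copy when you are done, because files in a tmpfs keep occupying memory until they are removed.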

Learn about pdflush: The Linux Page Cache and pdflush

There is also vmtouch, which lets you force any file to be read entirely into memory. That would definitely favor SSD. Also note some of the other tools on that site.
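
If vmtouch is not available, the same effect can be approximated by simply reading the file once so the kernel keeps its pages in the page cache. The sketch below is not vmtouch itself, only an analogous approach. The function name `warm_page_cache` is hypothetical, and the kernel may still evict the pages later under memory pressure.

```python
import os


def warm_page_cache(path, chunk_size=64 * 1024 * 1024):
    """Read path sequentially so the kernel caches its pages.

    Returns the number of bytes read. The data itself is discarded.
    """
    total = 0
    with open(path, "rb", buffering=0) as f:
        # Hint that the whole file will be needed soon (length 0 means
        # "to end of file"). Only available on Unix-like systems.
        if hasattr(os, "posix_fadvise"):
            os.posix_fadvise(f.fileno(), 0, 0, os.POSIX_FADV_WILLNEED)
        while True:
            chunk = f.read(chunk_size)
            if not chunk:
                break
            total += len(chunk)
    return total
```

A second read of the same file should then be served mostly from memory, provided nothing else has pushed it out of the cache in the meantime.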

So, I would suggest: SSDs and vmtouch (or an analogous tool).
This User Gave Thanks to jim mcnamara For This Post:
cmccabe (02-13-2016)
# 6  
Old 02-13-2016
Not directly related, but I attended a long workshop yesterday about our new storage system (EMC VMax 200k). EMC says the 300 GB 15k SAS drives were intended for high performance, but they are now being phased out because (quoting from memory) with the development of flash SSDs it is just not worth it any more. They also claim that, because they use SLC-based hardware, their disk-replacement rates are even lower than with rotational disks, even in heavy-duty transactional storage systems. The much lower energy consumption of SSDs compared to 15k SAS disks contributes to this: there is simply less heat involved, and that shows when you pack some 2500 disks into a rack.

You haven't said where you are going to place the workstation, but in case it is going to be somewhere near your desk: 15k disks are awfully LOUD, in addition to being premier heating devices, while SSDs are completely silent.

I hope this helps.

These 2 Users Gave Thanks to bakunin For This Post:
cmccabe (02-13-2016) jim mcnamara (02-13-2016)