SAS or SSD for Ubuntu 14.04 and data analysis


# 1  
Old 02-11-2016

I am building a workstation and have a question about performance. I am a scientist who works with big data (average file size 30-50 GB). My OS is Ubuntu 14.04, and so far I have a dual Xeon E5-2630 system (6 cores each) with 128 GB of memory. I/O buffering is an issue, so I am adding a 256/512? PCIe card and either 2 SSD or SAS drives for the OS and software. Since the PCIe card will be separate and used mainly for file transfer, would SAS or SSD be a better fit for the OS? I am leaning towards SAS because of the buffering issue, but wanted to ask more knowledgeable users. I forgot to mention that there will also be a separate 1 or 2 TB drive. Any recommendations for the size of the SAS or SSD drives? Thanks.
# 3  
Old 02-12-2016
I access files sequentially. Thank you.
# 4  
Old 02-12-2016
An SSD is an order of magnitude or more faster than a high-rpm SAS disk.
SSD capacity is limited, usually to 1-2 TB per drive. With 128 GB of memory, you could easily load whatever file you want from SSD into memory; the usual term for this is a RAM disk. Ubuntu supports this. It also caches files very effectively without much human intervention other than configuration.
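
A minimal sketch of the RAM disk idea in Python, assuming the stock tmpfs mount at /dev/shm is used as the RAM disk (a dedicated tmpfs mount works the same way). The function name `stage_in_ram` and its parameters are made up for illustration. Keep in mind that tmpfs pages count against memory, so a 30-50 GB file staged this way takes that much out of the 128 GB.

```python
import os
import shutil


def stage_in_ram(src_path, ram_dir="/dev/shm"):
    """Copy src_path into a tmpfs directory and return the path of the copy.

    ram_dir is an assumption: /dev/shm is a tmpfs mount on a stock Ubuntu
    install, but any tmpfs mount point can be passed instead.
    """
    size = os.path.getsize(src_path)
    free = shutil.disk_usage(ram_dir).free
    if size > free:
        # Refuse to start a copy that cannot fit in the RAM-backed filesystem.
        raise OSError("not enough free space in %s for %s" % (ram_dir, src_path))
    dst = os.path.join(ram_dir, os.path.basename(src_path))
    shutil.copyfile(src_path, dst)
    return dst
```

The analysis program is then pointed at the returned path instead of the original file, and the copy is deleted afterwards to give the memory back.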

Learn about pdflush: The Linux Page Cache and pdflush

There is also vmtouch, which lets you force any file to be read entirely into memory. That would definitely favor SSD.

https://hoytech.com/vmtouch/ Also note some other tools on that site.
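
For a rough idea of what pulling a file into the page cache means, here is a small Python sketch. It is not how vmtouch itself works and it cannot lock pages in memory; it simply reads the file once from start to end so the kernel caches the pages as a side effect. The function name `warm_page_cache` and the 16 MB chunk size are arbitrary choices for illustration.

```python
import os


def warm_page_cache(path, chunk_size=16 * 1024 * 1024):
    """Read path sequentially and return the number of bytes read.

    The data is thrown away; the point is that the kernel keeps the pages
    in its page cache, so a later read of the same file can be served
    from memory (until memory pressure evicts them).
    """
    total = 0
    with open(path, "rb", buffering=0) as f:
        if hasattr(os, "posix_fadvise"):
            # Tell the kernel the access pattern is sequential so it may
            # read ahead more aggressively (available on Unix, Python 3.3+).
            os.posix_fadvise(f.fileno(), 0, 0, os.POSIX_FADV_SEQUENTIAL)
        while True:
            chunk = f.read(chunk_size)
            if not chunk:
                break
            total += len(chunk)
    return total
```

The first pass runs at the speed of the underlying drive, which is where the SSD pays off. Nothing here pins the pages, so vmtouch or a similar tool is still the better fit if the file has to stay resident.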

So, I would suggest: SSDs and vmtouch (or an analogous tool).
This User Gave Thanks to jim mcnamara For This Post:
cmccabe (02-13-2016)
# 6  
Old 02-13-2016
Not directly related, but I had a longer workshop yesterday about our new storage system (EMC VMax 200k). EMC claims that they had intended the 300 GB 15k SAS drives for high performance, but are phasing them out now because (quoting from memory) with the development of flash SSDs it's just not worth it any more. They also claim that, because they use SLC-based hardware, they see even lower rates of disk replacement than with rotational disks, even in heavy-duty transactional storage systems, and that the much lower energy consumption of the SSDs compared to the 15k SAS disks contributes to this. There is simply less heat involved, and that shows when you pack some ~2500 disks into a rack.

You haven't said where you are going to place the workstation, but in case it is going to be somewhere near your desk: 15k disks are awfully LOUD in addition to being premier heating devices, while SSDs are completely silent.

I hope this helps.

bakunin
These 2 Users Gave Thanks to bakunin For This Post:
cmccabe (02-13-2016) jim mcnamara (02-13-2016)