

Writing fast and efficiently - how ?


 
# 1  
Old 06-03-2002

I have many processes, all of which need to write quite
a lot of data to the filesystem (to a single file).
Today this is managed as follows: all the processes
write their data to a shared memory block, which is managed by a process that empties it to a file, thus freeing space for
the other processes to write into.
It is now argued that this is slow and time-consuming,
specifically at times of high load on the shmem resource
(since each read/write is performed via a lock management
facility).
My question is: would it be faster to drop the whole idea
of the shared memory and use the buffers that the OS
provides? What type of locking would then be necessary?

I am working on an AIX RS/6000 machine.

Thank you in advance for your comments !
# 2  
Old 06-03-2002
You don't give us many details. But if this is like a log file, I would simply use the write(2) system call. If many processes open a file in append mode and then issue writes to it (via the system call directly), the writes will be atomic. Each record will be appended to the file in the order in which the writes occur. No additional locking is needed. The file will be buffered in the buffer cache, and the syncer and/or the UNIX write-behind policy will do the actual writing to disk. This is a very simple solution with low overhead and should be very portable.
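
A minimal sketch of the append-mode approach described above, written in Python for brevity (its os module maps directly onto open(2) and write(2)). The helper names open_for_append and write_record are made up for illustration. It assumes each record is handed to the kernel in a single write call and that the file is on a local filesystem; a buffered I/O library may split one record across several write calls, and then records from different processes can interleave.

```python
import os

def open_for_append(path):
    """Open (or create) the shared file in append mode.

    With O_APPEND the kernel moves the offset to end-of-file
    as part of every write on this descriptor.
    """
    return os.open(path, os.O_WRONLY | os.O_APPEND | os.O_CREAT, 0o644)

def write_record(fd, record):
    """Hand one complete record to the kernel in a single write call."""
    written = os.write(fd, record)
    if written != len(record):
        # A short write would leave a partial record in the file;
        # treat it as an error instead of retrying the remainder.
        raise OSError("short write: %d of %d bytes" % (written, len(record)))
    return written
```

Each process would call open_for_append once at startup and then write_record for every record it produces. Very large records, or files on a network filesystem, may not get the same guarantee, so this is worth testing on the target system.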

I don't understand exactly how you're using shared memory. This is also a viable solution. Most database packages use shared memory, and the database vendors compete heavily on performance. But they typically have access to secret OS facilities designed just for them.

I have never worked much with threads. But what I have read suggests that they might be an option. The theory goes that when processes lean too heavily on IPC facilities, they should simply become threads and talk via global data.

Ultimately you should probably try several approaches and benchmark them to find the fastest. And sometimes, if there is a lot of processing to do, there is a lot of processing to do. You can't always wave a magic wand and render intense processing trivial.
# 3  
Old 06-04-2002
Many processes writing to a single file

I have some doubts about the solution you suggested:
I am using Ada83, and writing to the file using Text_Io,
which is the standard way to perform basic I/O in Ada.
I have no direct access to the system call used to actually
implement the write (although I can of course import the
system call WRITE and call it directly, which is quite ugly
in Ada terms).
The file is actually not a log file but a binary data file,
but I do have a log file (a text file) in the system which is
written to directly with no shared memory buffering,
and I can see instances where a write operation by one process
was interrupted by a write from another process (the log line
from the first is cut in half and in the middle lies the line from
the second process), so there is no automatic locking via this call.

The shared memory is needed, as far as I understand, to minimize
access to the file by all the processes involved (there are many),
and instead supply them with a memory area where they dump the data.
A single process then reads the data and pours it into the file.
So no harm is caused to the working processes if the I/O
is for some reason damaged or stuck.

I am looking for ideas on how to reduce mutual wait problems due
to locking of the shared memory resource, maybe by eliminating it
or fragmenting it?

Thanx
Seeker
# 4  
Old 06-04-2002
Sorry, I don't know Ada. I'm not sure if anyone around here does...
# 5  
Old 06-04-2002
About a decade ago I might have been able to help you.

Heck, I haven't even worked on a DoD project in nearly four years...

But just thinking about the problem: can you use a message queue to pass the data to a daemon that writes it out at a later time (later on the scale of the microprocessor, that is)?

How important is the latency of the data being written?

Sending off a message would free the individual processes fairly quickly, and if the writing daemon got behind, it wouldn't affect the individual processes. And because a single-process daemon would be doing the writing, you wouldn't have to worry about lock contention on the file.
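
A rough sketch of that idea, in Python rather than Ada. Everything named here (writer_daemon, producer, run_demo, the record format) is invented for illustration, and multiprocessing.Queue merely stands in for whatever message queue facility the real system would use. It assumes a UNIX host where fork is available.

```python
import multiprocessing as mp

# Assumption: a UNIX host; the "fork" start method is requested explicitly.
_ctx = mp.get_context("fork")

def writer_daemon(queue, path):
    """Sole owner of the file: drain the queue until a None sentinel arrives."""
    with open(path, "ab") as out:
        while True:
            record = queue.get()
            if record is None:
                break
            out.write(record)

def producer(queue, ident, count):
    """A worker process: hand each record to the queue and carry on."""
    for seq in range(count):
        queue.put(("%d:%d\n" % (ident, seq)).encode())

def run_demo(path, n_producers=4, n_records=50):
    queue = _ctx.Queue()
    daemon = _ctx.Process(target=writer_daemon, args=(queue, path))
    daemon.start()
    workers = [_ctx.Process(target=producer, args=(queue, i, n_records))
               for i in range(n_producers)]
    for w in workers:
        w.start()
    for w in workers:
        w.join()
    queue.put(None)  # sent only after every producer has finished
    daemon.join()
```

Only the daemon ever touches the file, so the file itself needs no locking. The queue still synchronizes internally, so whether this beats the shared memory block would have to be measured.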

Just some random thoughts on a lazy Tuesday afternoon.

Last edited by auswipe; 06-04-2002 at 04:26 PM..
# 6  
Old 06-07-2002
Auswipe, when working with IPC, shared memory is always the fastest because it doesn't interact much with the kernel during its operations, contrary to the others, which depend heavily on kernel I/O.

In fact, data manipulation through shared memory is faster than through any other IPC mechanism you might mention.
# 7  
Old 06-09-2002
Actual cause of slowness?

Do we really know whether the slowness observed is caused by resource contention (many processes contending for one shmem segment) or by resource utilisation (the path length of the code doing the shmem access)?
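
One way to tell the two apart is to instrument the lock and record separately how long callers wait for it and how long they hold it. The sketch below does this in Python with a thread lock as a stand-in for the real shmem lock facility; the TimedLock class and its fields are made up for illustration.

```python
import threading
import time

class TimedLock:
    """A lock wrapper that accumulates wait time and hold time."""

    def __init__(self):
        self._lock = threading.Lock()
        self.wait_seconds = 0.0   # time spent blocked waiting for the lock
        self.hold_seconds = 0.0   # time spent inside the critical section

    def __enter__(self):
        started = time.perf_counter()
        self._lock.acquire()
        # From here on we own the lock, so updating the counters is safe.
        self._acquired = time.perf_counter()
        self.wait_seconds += self._acquired - started
        return self

    def __exit__(self, *exc):
        # Account for the hold time before releasing the lock.
        self.hold_seconds += time.perf_counter() - self._acquired
        self._lock.release()
```

If wait_seconds dominates, the problem is contention; if hold_seconds dominates, the code inside the critical section is the thing to shorten.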

If the actual cause is contention, and the processes appear slow because they spend most of their time waiting, one solution is to have more resources (multiple shmem segments?) and have the background process collect them all.
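
A small sketch of the multiple-segment idea, again in Python with in-process lists and thread locks standing in for shmem segments and their lock facility. The ShardedBuffer class and the rule for picking a segment (writer id modulo segment count) are assumptions for illustration only.

```python
import threading

class ShardedBuffer:
    """Several independent buffers, each guarded by its own lock."""

    def __init__(self, n_shards):
        self._shards = [([], threading.Lock()) for _ in range(n_shards)]

    def put(self, writer_id, record):
        # Each writer always uses the same shard, so writers on
        # different shards never wait for each other.
        records, lock = self._shards[writer_id % len(self._shards)]
        with lock:
            records.append(record)

    def drain(self):
        # The background process visits the shards one at a time and
        # holds only one shard lock at any moment.
        collected = []
        for records, lock in self._shards:
            with lock:
                collected.extend(records)
                records.clear()
        return collected
```

Records from one writer keep their relative order, but ordering across writers is lost, so this only fits if the file format can tolerate that or each record carries a sequence number.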
