Semaphores or not?

Deniz Dogan · Sep 24, 2006

Hello.

I am currently working on a small project (as a hobby) and part of it is
having a JTable with five columns (no need to know more about the actual
columns than the fact that there are five of them). What it's supposed
to show is information from two different text files, let's call them
original.txt and translation.txt, in separate columns. Both text files
are formatted like this:

//----------------------------

The text files are separated into paragraphs,
such as this one.

There may or may not be multiple rows right next to each other,
and there is no way of knowing how many rows there are before
a new empty line (the end of a paragraph).

Both of the files have this format.
Both files have the same number of paragraphs.

//----------------------------

What I need to do is capture each so-called paragraph of each file into
a separate String, replacing '\n' with '|', and add those Strings to the
table data in two separate columns.
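
For illustration, here is a minimal sketch of that paragraph capture. The class and method names (ParagraphReader, readParagraphs) are hypothetical, and it assumes that a line containing only whitespace counts as an empty line and that several empty lines in a row separate paragraphs just like one does.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not code from the original post.
final class ParagraphReader {

    /** Reads blank-line-separated paragraphs, joining the lines of each with '|'. */
    static List<String> readParagraphs(BufferedReader in) throws IOException {
        List<String> paragraphs = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        String line;
        while ((line = in.readLine()) != null) {
            if (line.trim().isEmpty()) {
                // An empty line ends the paragraph in progress, if there is one.
                if (current.length() > 0) {
                    paragraphs.add(current.toString());
                    current.setLength(0);
                }
            } else {
                // '|' stands in for the line break between rows of one paragraph.
                if (current.length() > 0) {
                    current.append('|');
                }
                current.append(line);
            }
        }
        // The last paragraph may not be followed by an empty line.
        if (current.length() > 0) {
            paragraphs.add(current.toString());
        }
        return paragraphs;
    }
}
```

Calling it once on a reader for original.txt and once on a reader for translation.txt would give two lists whose entries line up by index, assuming the files really do have the same number of paragraphs.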

Now to my question:
Is it better to create two threads which read information from each file
and use Semaphore-based barrier synchronization to change values of a
row of the table only once (with both the information from original.txt
and translation.txt at once) ...OR... is it better to first sequentially
read original.txt and then use table.setValueAt(Object, int, int) to set
the correct values from translation.txt?

Phew, I hope you understand what I'm trying to say.

/Deniz Dogan

Eric Sosman · Sep 24, 2006

Deniz said:
> Hello.
>
> I am currently working on a small project (as a hobby) and part of it is
> having a JTable with five columns (no need to know more about the actual
> columns than the fact that there are five of them). [... and that they
> get filled with data from different input files ...]
>
> Now to my question:
> Is it better to create two threads which read information from each file
> and use Semaphore-based barrier synchronization to change values of a
> row of the table only once (with both the information from original.txt
> and translation.txt at once) ...OR... is it better to first sequentially
> read original.txt and then use table.setValueAt(Object, int, int) to set
> the correct values from translation.txt?

As far as I can see, the only advantage of deferring a bunch
of updates and doing them in one batch would be that you might
avoid firing some events. I doubt that will be a big savings,
but you should measure it if you need to be sure.
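
One rough way to see the difference in fired events is to count them on a DefaultTableModel. This is only a sketch: the class and method names are hypothetical, and it assumes the two text columns are columns 0 and 1 of the five.

```java
import java.util.List;
import javax.swing.event.TableModelListener;
import javax.swing.table.DefaultTableModel;

// Hypothetical comparison helper, not code from the thread.
final class TableFillSketch {

    /** Sizes the model, then sets each cell with setValueAt (one event per cell). */
    static void fillCellByCell(DefaultTableModel model,
                               List<String> originals, List<String> translations) {
        model.setRowCount(originals.size());
        for (int row = 0; row < originals.size(); row++) {
            model.setValueAt(originals.get(row), row, 0);
            model.setValueAt(translations.get(row), row, 1);
        }
    }

    /** Builds every row first, then hands the whole array over in a single call. */
    static void fillInOneBatch(DefaultTableModel model, Object[] columnNames,
                               List<String> originals, List<String> translations) {
        Object[][] rows = new Object[originals.size()][columnNames.length];
        for (int row = 0; row < originals.size(); row++) {
            rows[row][0] = originals.get(row);
            rows[row][1] = translations.get(row);
        }
        model.setDataVector(rows, columnNames);
    }

    /** Returns how many TableModelEvents the model fires while work runs. */
    static int countEvents(DefaultTableModel model, Runnable work) {
        int[] count = {0};
        TableModelListener counter = event -> count[0]++;
        model.addTableModelListener(counter);
        try {
            work.run();
        } finally {
            model.removeTableModelListener(counter);
        }
        return count[0];
    }
}
```

The cell-by-cell version fires an event for every setValueAt call, while setDataVector fires a single structure-changed event. Whether that matters for a table of this size is exactly what would need measuring. Note also that a model attached to a visible JTable should be updated on the event dispatch thread.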

Patricia Shanahan · Sep 24, 2006

Deniz Dogan wrote:
> ....
>
> Is it better to create two threads which read information from each file
> and use Semaphore-based barrier synchronization to change values of a
> row of the table only once (with both the information from original.txt
> and translation.txt at once) ...OR... is it better to first sequentially
> read original.txt and then use table.setValueAt(Object, int, int) to set
> the correct values from translation.txt?
>
> Phew, I hope you understand what I'm trying to say.

I'm assuming the processing is reasonably simple.

If the files are not cached in memory at the time the job runs, it
should run at disk read speed. If both files are on the same disk,
reading one file at a time will tend to be faster because it reduces
disk head movement. If they are on different disks, reading them in
parallel may be faster than one at a time, because you get to make
effective use of both disk heads.

If the files are cached in memory, the job's CPU time becomes the
critical factor. On a dual processor, or higher, you may get some gain
from running two threads. However, there is a risk that the chunks of
parallel work may be too small, and the cost of synchronization too
high, for a net gain. Also, you may find data contention between the two
processors messes up caching, reducing performance.
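
If the two-thread route were tried, one coarse-grained way to do it would be to let each thread load a whole file and wait for both results once, which keeps the synchronization cost to a single join. The sketch below uses an ExecutorService with Futures rather than the Semaphore-based barrier from the question. The names ParallelLoad and loadBoth are hypothetical, and the two Callables stand in for whatever code reads each file.

```java
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical helper, not code from the thread.
final class ParallelLoad {

    /** Runs both loaders on separate threads and returns [first, second] once both finish. */
    static <T> List<T> loadBoth(Callable<T> first, Callable<T> second) throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(2);
        try {
            Future<T> a = pool.submit(first);
            Future<T> b = pool.submit(second);
            // get() blocks until the task is done, so this line is the only
            // point where the calling thread waits for the workers.
            return Arrays.asList(a.get(), b.get());
        } finally {
            pool.shutdown();
        }
    }
}
```

Timing this against a plain sequential read of the same two files would show whether the extra threads pay for themselves on a given machine.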

Even if you have separate disk drives or a dual processor, I would start
with the simple single-threaded implementation, and only consider going
to two threads if this turns out to be performance critical relative to
the whole program.

Patricia

Deniz Dogan · Sep 24, 2006

Thank you for your response, Patricia! That was pretty much what I was
thinking as well, and the code actually gets prettier when I read the
files sequentially. I'll stick with that approach for now!

/Deniz Dogan
