For performance, write it in C

Chad Perrin

No, just that you left out the bit about JIT compilation into native
code.

Gee willickers, I'm sorry I didn't use the exact phrasing you wanted me
to. Maybe next time, though, you won't claim the bits I said that
didn't actually have anything to do with your actual complaint were
wrong.

Y'know, screw it. Be an ass if you like. I'm done with this subthread.
 
Peter Hickman

Tim said:
So, first gripe: C is faster than Ruby *in certain problem domains*.
In others, it's not.
The post was about people wanting better performance for their code.
Quite clearly, if the code you have written in Ruby (or whatever) runs
fast enough for you then performance is a non-issue. If, however, the
performance of your code is an issue, then in truth there is only so much
improvement that you can squeeze out of Ruby. If that is enough to
resolve your performance issues then fine. If you want still more
performance then you want to write it in C (or perhaps buy some new
hardware :) )
Second gripe. The notion of doing a wholesale rewrite in C is almost
certainly wrong.
An earlier project of mine used GD from Ruby to calculate some colour
metrics from images and write them into a database. I rewrote the whole
thing in C, using the same GD and SQLite2 libraries as the Ruby version,
and the improvement was massive, despite the fact that the Ruby was not
actually doing very much. Most of the time was spent in the GD library,
so I am not all that convinced that rewriting part of a project in C
will achieve quite the same improvement. And if you are going to convert
a significant chunk of code to C then you may as well go the whole hog.
In fact, the notion of doing any kind of serious hacking, without
doing some measuring first, is almost always wrong. The *right* way
to build software that performs well is to write a natural, idiomatic
implementation, trying to avoid stupid design errors but not worrying
too much about performance. If it's fast enough, you're done. No problem here.
If it's not fast enough, don't write another line of code till you've
used a profiler and understand what the problem is. If in fact
this is the kind of a problem where C is going to do better, chances
are you only have to replace 10% of your code to get 90% of the
available speedup.
That has not been my experience to date, but then perhaps I am not
working on problems that can be solved in that way.
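
As a rough illustration of the measure-first advice quoted above, here is a minimal Ruby sketch using the standard Benchmark module. The stage names, the workloads and the `time_stages` helper are all made up for the example; a real script would wrap its own stages, and a proper profiler would give finer detail than this.

```ruby
require 'benchmark'

# Time each named stage and return { name => seconds }.
# The stages below are hypothetical stand-ins for parts of a real script.
def time_stages(stages)
  stages.each_with_object({}) do |(name, work), timings|
    timings[name] = Benchmark.realtime { work.call }
  end
end

stages = {
  parse:  -> { 20_000.times { |i| i.to_s.split('') } },
  lookup: -> { h = { 'a' => 1 }; 20_000.times { h['a'] } },
  output: -> { 20_000.times { |i| "row #{i}" } }
}

timings = time_stages(stages)
slowest = timings.max_by { |_, secs| secs }.first

timings.each { |name, secs| puts format('%-8s %.4fs', name, secs) }
puts "Look at #{slowest} first"
```

Only the stage that dominates the timings is a candidate for rewriting, whether in better Ruby or in C.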
 
Pit Capitain

Ryan said:
Others have reported being able to use inline on windows... why can't you?

Ryan, googling for

"ruby inline" windows

gave me no usable hint among the first 50 results besides using cygwin.
Do you have a link to the reports you mention?

Maybe I should have written that given that I'm using the One Click
Installer, don't have the Windows compiler toolchain, and am not willing
to use cygwin, I can't use Ruby Inline. Is this better?

But, instead I might try to use Ruby Inline with MinGW, so thanks for
the question.

Regards,
Pit
 
Csaba Henk

-------------------- start of latin.curry ----------------------------
-- upto is a nondeterministic function that evaluates to
-- a number from 1 up to n
upto 1 = 1
upto n | n > 1 = n ? upto (n-1)

-- check if the lists r s have no element with the same value at the
-- same position
elems_diff r s = and $ zipWith (/=) r s

-- extend takes a list of columns, and extends each column with a
-- number for the next row. It checks the number against the column and
-- against the previous numbers in the row.

extend :: [[Int]] -> Int -> [[Int]]
extend cols n = addnum cols [] where
    addnum [] _ = []
    addnum (col:cs) prev
        | x =:= upto n &
          (x `elem` prev) =:= False &
          (x `elem` col) =:= False = (x:col) : addnum cs (x:prev)
        where x free

latin_square n = latin_square_ n
    where latin_square_ 0 = replicate n [] -- initialize columns to nil
          latin_square_ m | m > 0 = extend (latin_square_ (m-1)) n

square2str s = unlines $ map format_col s
    where format_col col = unwords $ map show col

main = mapIO_ (putStrLn . square2str) (findall (\s -> s =:= latin_square 5))
------------------------- end latin.curry -----------------------------

It's really nice and compact!
AFAIK Curry is Haskell boosted with logic programming.

I -- who, ATM, just watch these languages from a distance, and can't
tell by looking at the code -- wonder if you have used something here
that is specific to Curry and would be harder/uglier to express in
Haskell?

And what does the Curry compiler look like? Is it just a hacked GHC? How
does Curry performance relate to that of Haskell?

Regards,
Csaba
 
Jan Svitok

Well, you need the compiler toolchain if you want to compile, and
compiling is what inline does.

On windows, you have three options:

- MS - you can get by with their free compiler (VS Express or something)
- cygwin
- mingw

I have full VS, and inline worked for me when I started the program
with the proper environment (i.e. the proper paths set), although I've
tried only the examples. And it seems VC6 and VC7 (VS2003) are better to
use due to the manifest stuff that VC8 (VS2005) creates.

I haven't tried cygwin or mingw.

J.
 
William James

Kristof said:
-- check if the lists r s have no element with the same value at the
-- same position
elems_diff r s = and $ zipWith (/=) r s

I don't see where elems_diff is used after it is defined.
 
Kristof Bastiaensen

Kristof said:
-- check if the lists r s have no element with the same value at the
-- same position
elems_diff r s = and $ zipWith (/=) r s

I don't see where elems_diff is used after it is defined.

Heh, you are right! I defined it, and then when I didn't need it I forgot
to remove it.

Thanks for noting,
Kristof
 
M. Edward (Ed) Borasky

Pit said:
Maybe I should have written that given that I'm using the One Click
Installer, don't have the Windows compiler toolchain, and am not
willing to use cygwin, I can't use Ruby Inline. Is this better?
Speaking of CygWin, a couple of people here have expressed what seems
like disdain for it. I am constrained to use a Windows desktop at my day
job, and CygWin has been an important factor in my retaining my sanity
about the fact. I don't use the server pieces of CygWin. My preference
(in open source tools) is first native Windows, second CygWin and third
(Gentoo) Linux. I was dual booted with Gentoo for a while until the
VMware Server beta started. That became a viable option so that's how I
exercise the third option now.

So what is the source of the reluctance to use CygWin in the Ruby community?
 
Kristof Bastiaensen

It's really nice and compact!
AFAIK Curry is Haskell boosted with logic programming.

Yes, exactly!

I -- who, ATM, just watch these languages from a distance, and can't
tell by looking at the code -- wonder if you have used something here
that is specific to Curry and would be harder/uglier to express in
Haskell?

Yes, the =:= operator unifies terms like in logic languages, and curry
makes it possible to write nondeterministic functions. For example the
upto function I defined above can evaluate to any number from 1 upto n,
while in haskell it could have only one result. In the code that I wrote
above:
upto n | n > 1 = n ? upto (n-1)

is the same as
upto n | n > 1 = n
upto n | n > 1 = upto (n-1)

Then there are search functions that make it possible to extract all outcomes
from a nondeterministic function in a lazy way (i.e. findall)

In haskell the above would probably be written in a monad that expresses
nondeterminism, but I doubt it will be as clear as the Curry code.
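
For readers more at home in Ruby than in Curry, the nondeterministic choice can be approximated with explicit backtracking and a lazy Enumerator. This is only a sketch of the idea, not a translation of the Curry program: it builds rows rather than columns, `latin_squares` is a made-up name, and it assumes n >= 1. The `n.downto(1)` loop plays the role of the `upto n` choice point, and the Enumerator plays the role of findall.

```ruby
# Lazily yields every n x n Latin square as an array of rows (n >= 1).
def latin_squares(n)
  return enum_for(:latin_squares, n) unless block_given?

  place = lambda do |rows, row|
    if row.size == n
      # Row complete: either the square is finished or we start the next row.
      new_rows = rows + [row]
      if new_rows.size == n
        yield new_rows
      else
        place.call(new_rows, [])
      end
    else
      col = row.size
      n.downto(1) do |x|                       # try each candidate in turn
        next if row.include?(x)                 # x already used in this row
        next if rows.any? { |r| r[col] == x }   # x already used in this column
        place.call(rows, row + [x])             # falling out of the loop = backtrack
      end
    end
  end

  place.call([], [])
end

# Only the first two solutions are computed here.
latin_squares(3).first(2).each do |square|
  puts square.map { |r| r.join(' ') }.join("\n")
  puts
end
```

The bookkeeping that Curry hides (trying alternatives, undoing failed choices) is spelled out by hand here.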

And what does the Curry compiler look like? Is it just a hacked GHC? How
does Curry performance relate to that of Haskell?

As far as I know the Curry compiler I used (Munster CC) is written from
scratch, in Haskell. I doubt it is as fast and optimized as the Haskell
compiler, since Haskell has a much larger user base.

Regards,
Kristof
 
Keith Gaughan

Gee willickers, I'm sorry I didn't use the exact phrasing you wanted me
to. Maybe next time, though, you won't claim the bits I said that
didn't actually have anything to do with your actual complaint were
wrong.

Y'know, screw it. Be an ass if you like. I'm done with this subthread.

I was being polite until you made a passive-aggressive remark about me
being "less than sporting". But if you want to act like that, there's
not much I can do to stop you.
 
Ashley Moran

Next time I get a morning free I might apply some of the tweaks that have
been suggested. Might be interested to see how much I can improve the
performance.


I looked at the source of the script today, and I made these changes:

- use FasterCSV instead of CSV
- don't buffer every row in the datasets, send them straight to
Zlib::GzipWriter as they are processed
- don't do hash lookups in the middle of a 15 million row loop, do them
once in advance!
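
The last two changes can be sketched roughly as follows. The sample data, the `settings` hash and the column names are invented for illustration and are not taken from the actual script; the code uses the standard `csv` library, which is what FasterCSV became in Ruby 1.9, and StringIO objects in place of real files.

```ruby
require 'csv'
require 'zlib'
require 'stringio'

# Hypothetical input and settings, standing in for the real datasets.
input    = "id,code\n1,a\n2,b\n3,a\n"
settings = { 'code_column' => 'code', 'prefix' => 'X-' }

# Look the settings up once, before the loop, not once per row.
code_column = settings.fetch('code_column')
prefix      = settings.fetch('prefix')

compressed = StringIO.new
gz = Zlib::GzipWriter.new(compressed)

# Read row by row and write each processed row straight to the gzip
# stream, so no array of all rows is ever built.
CSV.new(StringIO.new(input), headers: true).each do |row|
  gz.write([row['id'], prefix + row[code_column]].to_csv)
end
gz.finish # flush the gzip trailer without closing the StringIO

output = Zlib::GzipReader.new(StringIO.new(compressed.string)).read
puts output
```

Memory use then stays roughly constant regardless of how many rows pass through.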

Unfortunately I'm still stuck with a nasty "rows.each { |row| row.each { |col|
col.strip! } }" type section, to fix the poor quality of the data, which
would take a lot of time going through all the fields to thin out.

Despite this, I've got the run time down from over 2.5 hours to about 50
minutes. The smaller files are individually about 6x faster, but I'm happy
with 3x faster overall. It means we can realistically run it in the day if
there are issues.

One curious thing is that while the real time was about 50 mins, the user
time was only about 30 mins (negligible sys time if I remember). Not sure
where the other 20 mins has gone?
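
One possible explanation, offered as an assumption rather than a diagnosis of this particular script: real time is wall-clock time, while user and sys time only count CPU time charged to the process, so time spent blocked on disk, a database or another process shows up in real time only. The sketch below uses `sleep` as a stand-in for waiting on I/O.

```ruby
require 'benchmark'

# sleep stands in for time spent blocked on I/O: the clock keeps
# running but the process uses almost no CPU.
waiting = Benchmark.measure { sleep 0.2 }

# A busy loop, by contrast, is charged to user time.
busy = Benchmark.measure { 300_000.times { |i| Math.sqrt(i) } }

puts format('waiting: real %.2fs, user+sys %.2fs',
            waiting.real, waiting.utime + waiting.stime)
puts format('busy:    real %.2fs, user+sys %.2fs',
            busy.real, busy.utime + busy.stime)
```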

Ashley

--
"If you do it the stupid way, you will have to do it again"
- Gregory Chudnovsky
 
Keith Gaughan

Design decisions that involve interfacing with interface software that
sucks are related to the software under discussion -- and not all of the
interface is entirely delegated to Windows, either. No software can be
evaluated for its performance characteristics separate from its
environment except insofar as it runs without that environment.

Here's all I'm saying: the environment is important, but it's a variable
that must be cancelled when talking about some piece of software that's
running on top of it. You can only make judgements about the speed of
something like Excel by comparing it to another spreadsheet with a
similar set of features running on Windows. Otherwise, you're only
making guesses as to where the sluggishness and bloat lie.
Actually, no, it's not an emulator.

Yes, it is. It's a set of libraries and executables that emulate a
Windows environment.
It's a set of libraries (or a
single library -- I'm a little sketchy on the details) that provides the
same API as Windows software finds in a Windows environment. An
emulator actually creates a faux/copy version of the environment it's
emulating.

Which both Wine and Cygwin do. To quote the Wikipedia article on
emulators:

A software emulator allows computer programs to run on a platform
(computer architecture and/or operating system) other than the one for
which they were originally written.

Linux compatibility on FreeBSD is a software emulator that fools Linux
executables into thinking they're running on Linux. Because of the
commonalities between FreeBSD and Linux, this emulation layer can be
thin.
It is to Linux compared with Unix as an actual emulator is
to Cygwin compared with Unix: one is a differing implementation and the
other is an emulator.
?

. . . and, in fact, there are things that run faster via Wine on Linux
than natively on Windows.

Not surprising, really.
[ snip ]
under FreeBSD. Bringing Wine in is a red herring. Software cannot be
blamed for the environment it's executed in.

I didn't bring it up. You did. I made a comment about Excel not
working in Linux as a bit of a joke, attempting to make the point that
saying Excel performance can be evaluated separately from its dependence
on Windows doesn't strike me as useful.

See above.

--
Keith Gaughan - (e-mail address removed) - http://talideon.com/
Abbott's Admonitions:
1: If you have to ask, you're not entitled to know.
2: If you don't like the answer, you shouldn't have asked
the question.
-- Charles Abbot, dean, University of Virginia
 
Chad Perrin

I was being polite until you made a passive-aggressive remark about me
being "less than sporting". But if you want to act like that, there's
not much I can do to stop you.

That wasn't a passive-aggressive remark, it was a joking comment about
the inequity of your comparison (intentional or otherwise). You're
welcome to your misconceptions and bad attitudes, though.
 
Bill Kelly

Chad Perrin said:
That wasn't a passive-aggressive remark, it was a joking comment about
the inequity of your comparison (intentional or otherwise). You're
welcome to your misconceptions and bad attitudes, though.

GENTLEMEN! YOU CAN'T FIGHT IN HERE, THIS IS THE WAR ROOM !!!
 
Keith Gaughan

That wasn't a passive-aggressive remark, it was a joking comment about
the inequity of your comparison (intentional or otherwise).

It didn't come across as joking.
You're welcome to your misconceptions and bad attitudes, though.

Ditto. I wasn't the one who wrote "Y'know, screw it. Be an ass if you
like." I hadn't even considered flipping the bozo bit until I read that.
 
Chad Perrin

Here's all I'm saying: the environment is important, but it's a variable
that must be cancelled when talking about some piece of software that's
running on top of it. You can only make judgements about the speed of
something like Excel by comparing it to another spreadsheet with a
similar set of features running on Windows. Otherwise, you're only
making guesses as to where the sluggishness and bloat lie.

What's important is how two pieces of software run in the same
environment, not whether the environment is the reason a given
application is slow at some things. That was my point: the GUI
performance of Excel is, indeed, relevant to a discussion of Excel
performance, despite the fact that significant chunks of Excel's GUI are
implemented by way of the environment. Compare it with another
spreadsheet running in the same environment, and don't cancel some of
its slowness by blaming it on Windows.

Yes, it is. It's a set of libraries and executables that emulate a
Windows environment.

No, it's not. Repeat after me: "WINE Is Not an Emulator". That's not
just an affectation. It is a statement of fact about WINE. That's why
they call it WINE. A Windows emulator would be a "fake Windows"
running in Linux, like a VM: WINE is basically just an API that happens
to be as close to the Windows API (in all useful ways) as the WINE
developers can get it. It does not pretend to be a Windows machine. It
just provides compatibility for Windows programs on Linux.

Perhaps you aren't aware that WINE stands for WINE Is Not an Emulator,
or that they aren't lying when they say that.


(where ~ means "roughly equivalent to")

Differing implementations:
Wine ~ Windows
Linux ~ Unix

Emulators:
Emulator != Original
Cygwin != Linux
 
James Edward Gray II

I looked at the source of the script today, and I made these changes:

- use FasterCSV instead of CSV

Fair warning, I'm coming into this conversation late and I haven't
read all that came before this. However, if you are using FasterCSV...
Unfortunately I'm still stuck with a nasty "rows.each { |row| row.each { |col|
col.strip! } }" type section, to fix the poor quality of the data, which
would take a lot of time going through all the fields to thin out.

FasterCSV can convert fields as they are read. I'm not sure if this
will be faster, but it may be worth a shot. See the :converters
argument to FasterCSV::new:

http://fastercsv.rubyforge.org/
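
A minimal sketch of the converter idea, written against the standard `csv` library that FasterCSV became in Ruby 1.9. The sample data and the `strip` lambda are invented for the example, and whether this actually beats a separate strip! pass would need measuring.

```ruby
require 'csv'

# Hypothetical messy input with stray whitespace around the fields.
data = " a , b \n c ,d\n"

# A converter is called for each field as it is parsed; this one
# strips whitespace and leaves nil (empty) fields alone.
strip = ->(field) { field.nil? ? field : field.strip }

rows = CSV.parse(data, converters: [strip])
p rows
```

This folds the cleanup into parsing, so the nested rows.each / row.each pass is no longer needed.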

James Edward Gray II
 
Chad Perrin

It didn't come across as joking.

I'd be inclined to apologize for the misunderstanding if you hadn't
decided I was the devil incarnate for a mis-taken joke.

Ditto. I wasn't the one who wrote "Y'know, screw it. Be an ass if you
like." I hadn't even considered flipping the bozo bit until I read that.

Reread what you said in the preceding posts and tell me if you wouldn't
have the same reaction to someone flying off the damned handle at a
stupid joke. I tried to inject levity because I could see the
conversation heading in a bad direction, and you were so intent on
seeing me in a bad light that it didn't occur to you to assume good
faith on my part. Congratulations.
 
